The design of hydrocarbon conversion processes increasingly demands more realistic modeling tools capable of predicting product distributions at the molecular level. Such tools require a detailed molecular characterization of the feed, which is generally not available for complex feeds. Modeling techniques that simulate the molecular composition of petroleum feedstocks from routine analyses offer an alternative route towards molecule-based kinetic modeling. Nevertheless, generating a useful molecular representation depends heavily on the scope of the input analyses. This study aimed to verify the requirements, in terms of input data and model formulation, for constructing a molecular representation that suits a detailed kinetic model. Two straight-run naphtha samples were selected as the basis, so that the simulated composition could be compared against gas chromatography data; the analysis was afterwards extended to a light cycle oil. It was confirmed that it is possible to generate a synthetic mixture that behaves like the actual petroleum fraction in terms of bulk properties and even carbon number distributions. The comparison against gas chromatography data, on the other hand, revealed significant differences at the molecular level, primarily due to the inability to predict all possible structural combinations of alkyl groups, the number of which increases exponentially with the carbon number. For practical reasons, it is therefore proposed to work with a reduced set of chemically relevant species, built from a main core structure (e.g. an arrangement of aromatic or naphthenic rings) and a reduced set of alkyl branch configurations, rather than considering a vast number of structural combinations.
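To give a sense of how quickly the number of alkyl branch configurations grows with carbon number, the following is a minimal illustrative sketch and not the method used in this study. It counts the constitutional isomers of an alkyl group with n carbon atoms by treating the group as a rooted tree in which each carbon carries up to three further branches. The function name `alkyl_isomer_counts` is hypothetical, and stereoisomers are deliberately ignored, so the true number of distinct species is larger still.

```python
from math import comb


def alkyl_isomer_counts(max_carbons):
    """Count constitutional isomers of alkyl groups C_n H_(2n+1).

    Assumption: an alkyl group is modeled as a rooted tree whose root
    carbon (the attachment point) carries three unordered sub-branches,
    each being either a hydrogen (0 carbons) or a smaller alkyl group.
    Stereoisomers are not distinguished.
    Returns a list where counts[n] is the number of groups with n carbons.
    """
    counts = [1]  # counts[0] = 1: the "empty" branch, i.e. a hydrogen atom
    for n in range(1, max_carbons + 1):
        remaining = n - 1  # carbons left to distribute over the three branches
        total = 0
        # Enumerate branch sizes i <= j <= k with i + j + k = remaining
        for i in range(remaining // 3 + 1):
            for j in range(i, (remaining - i) // 2 + 1):
                k = remaining - i - j
                if i == j == k:
                    # three branches of equal size: multisets of size 3
                    total += comb(counts[i] + 2, 3)
                elif i == j:
                    # two equal-size branches plus one larger branch
                    total += comb(counts[i] + 1, 2) * counts[k]
                elif j == k:
                    # one smaller branch plus two equal-size branches
                    total += counts[i] * comb(counts[j] + 1, 2)
                else:
                    # all three branch sizes differ
                    total += counts[i] * counts[j] * counts[k]
        counts.append(total)
    return counts


if __name__ == "__main__":
    for n, c in enumerate(alkyl_isomer_counts(12)):
        if n > 0:
            print(f"C{n}: {c} alkyl structures")
```

Under these assumptions the sketch reproduces the familiar small cases (two propyl, four butyl, and eight pentyl structures) and reaches 39 structures at seven carbons. Attaching such groups to every position of a ring core multiplies the count further, which is consistent with the proposal to retain only a reduced set of branch configurations.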