The design of hydrocarbon conversion processes currently demands more realistic modeling tools, with the capability of predicting product distribution at the molecular level. This requires a detailed molecular characterization, which generally is not available for complex feeds. The implementation of modeling techniques to simulate the molecular composition of petroleum feedstocks from routine analyses represents an alternative route towards molecule-based kinetic modeling. Nevertheless, generating a useful molecular representation depends heavily on the scope of the input analyses. This study aimed at verifying the requirements to construct a molecular representation that suits a detailed kinetic model, in terms of input data and model formulation. Two straight-run naphtha samples were selected as a basis, in order to compare the simulated composition against gas chromatography data, and afterwards the analysis was extended to a light cycle oil. It was confirmed that it is possible to generate a synthetic mixture that behaves like the actual petroleum fraction in terms of bulk properties and even carbon number distributions. The comparison against gas chromatography data, on the other hand, revealed that there are significant differences at the molecular level, primarily due to the inability to predict all possible structural combinations of alkyl groups, the number of which increases exponentially with the carbon number. For practical reasons, then, it is proposed to work with a reduced set of chemically relevant species, built from a main core structure (e.g. an arrangement of aromatic or naphthenic rings) and a reduced set of alkyl branch configurations, rather than considering a vast number of structural combinations.
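
To give a feel for how quickly the number of alkyl branch configurations grows with carbon number, the following minimal sketch counts the structurally distinct alkyl groups C_nH_(2n+1) that can be attached to a core. It is an illustration only and not the method used in this study: the function name `count_alkyl_groups` is hypothetical, stereoisomers are ignored, and each alkyl group is treated as a rooted tree in which every carbon carries at most three unordered sub-branches.

```python
from math import comb


def count_alkyl_groups(max_carbons):
    """Count structural isomers of alkyl groups with 0..max_carbons carbons.

    Assumption (illustrative only): an alkyl group is a rooted tree whose
    root is the carbon bonded to the core; each carbon has up to three
    unordered sub-branches, and stereoisomers are not distinguished.
    """
    # counts[0] = 1 stands for the "empty" branch, i.e. a hydrogen atom.
    counts = [1]
    for n in range(1, max_carbons + 1):
        remaining = n - 1  # carbons left to distribute over three sub-branches
        total = 0
        # Enumerate sub-branch sizes i <= j <= k with i + j + k == remaining.
        for i in range(remaining // 3 + 1):
            for j in range(i, (remaining - i) // 2 + 1):
                k = remaining - i - j
                if i == j == k:
                    # Three branches of equal size: multisets of size 3.
                    total += comb(counts[i] + 2, 3)
                elif i == j:
                    # Two equal-size branches (multiset of 2) plus one larger.
                    total += comb(counts[i] + 1, 2) * counts[k]
                elif j == k:
                    # One smaller branch plus two equal-size branches.
                    total += counts[i] * comb(counts[j] + 1, 2)
                else:
                    # All sizes differ: independent choices.
                    total += counts[i] * counts[j] * counts[k]
        counts.append(total)
    return counts


if __name__ == "__main__":
    for n, c in enumerate(count_alkyl_groups(12)):
        if n > 0:
            print(f"C{n}: {c} alkyl groups")
```

Under these assumptions the count goes from 4 butyl groups to 17 hexyl groups and 89 octyl groups, and this is for a single substituent. Combining several substituents on one core multiplies the possibilities further, which is the motivation for restricting the representation to a reduced set of branch configurations.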