In this study, we investigated gender prediction in random
chat networks by using network topology statistics. Random
chat networks are quite different from most other well-studied
social networks, since the system, rather than the users
themselves, decides who is placed in a chat together, whereas
users in most existing social networks choose their own
connections. In addition, all words in our dataset are masked,
which removes contextual information; such masking is necessary
in certain environments due to increasing privacy concerns. Our
study
found that adding network statistics lets us predict gender
significantly better than using masked word vector features
alone. Furthermore, our experiments show that in
this particular network, Degree is the most useful feature for
boosting gender prediction, followed closely by PageRank.
While Betweenness Centrality and Clustering Coefficient are
found to be less informative than the other features for gender
prediction, using them is still significantly better than using
no network features at all.
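
As an illustration only, the four topology statistics discussed above
could be computed from a list of chat pairs roughly as follows. This is
a minimal sketch, assuming an undirected, unweighted graph in which an
edge joins two users who were placed in a chat together. The function
names, the damping factor of 0.85, and the toy edge list are
hypothetical and are not the implementation used in the study.

```python
from collections import deque


def build_adjacency(edges):
    """Undirected adjacency sets from (user_a, user_b) chat pairs."""
    adj = {}
    for a, b in edges:
        if a == b:  # assumption: self-chats are ignored
            continue
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def degree(adj):
    """Number of distinct chat partners per user."""
    return {v: len(nbrs) for v, nbrs in adj.items()}


def clustering_coefficient(adj):
    """Fraction of a user's partner pairs that also chatted together."""
    out = {}
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            out[v] = 0.0
            continue
        # Each edge among the neighbours is seen from both endpoints.
        links = sum(len(adj[u] & nbrs) for u in nbrs) / 2
        out[v] = 2.0 * links / (k * (k - 1))
    return out


def pagerank(adj, damping=0.85, iterations=100):
    """PageRank by power iteration; every node here has >= 1 neighbour."""
    n = len(adj)
    rank = {v: 1.0 / n for v in adj}
    for _ in range(iterations):
        new = {v: (1.0 - damping) / n for v in adj}
        for v, nbrs in adj.items():
            share = damping * rank[v] / len(nbrs)
            for u in nbrs:
                new[u] += share
        rank = new
    return rank


def betweenness(adj):
    """Unnormalised betweenness centrality (Brandes' algorithm, BFS)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack, queue = [], deque([s])
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:  # accumulate dependencies in reverse BFS order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # In an undirected graph each pair is counted from both ends.
    return {v: b / 2.0 for v, b in bc.items()}


def topology_features(edges):
    """Per-user vector: [degree, PageRank, betweenness, clustering]."""
    adj = build_adjacency(edges)
    deg, pr = degree(adj), pagerank(adj)
    bc, cc = betweenness(adj), clustering_coefficient(adj)
    return {v: [deg[v], pr[v], bc[v], cc[v]] for v in adj}


# Hypothetical toy network: a triangle of users plus one pendant user.
TOY_EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
features = topology_features(TOY_EDGES)
```

Under these assumptions, each user's four-element vector could be
appended to that user's masked word vector features before training a
classifier, which mirrors the comparison between using network features
and using none.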