A Data-Driven Exploration of My Reading Journey
Introduction
Reading has always been an integral part of my life. From history books and spy novels to texts on statistics and machine learning, I have continually sought to broaden my horizons. Authors like Murakami have captured my imagination, while leadership and management books have offered invaluable insights into personal and professional growth.
Throughout my life, there have been pivotal moments that have shaped my perspectives and interests. By analyzing the trends in the books I’ve been drawn to, I hope to uncover how these moments have influenced my reading. Using Python, I wrote a script that applies topic modeling to the notes I imported from neoReader and Readwise, with the aim of identifying the topics covered in the books I’ve read over the past two years. In this post, I present the process and results of this exploration, shedding light on the themes and genres that have shaped my reading history.
The Power of Digital Books
I often find myself reminiscing about when I would wander through bookstores, picking up books and feeling the magic of turning pages in my hands. Physical books hold a special charm; however, their limitations became more apparent as I delved deeper into the world of digital books.
One of the shortcomings of physical books is the difficulty in taking and organizing notes. Digital books, on the other hand, simplify this process, making it easier to build and manage a body of knowledge. Additionally, the convenience of carrying digital books while traveling or relaxing at the beach cannot be overstated.
Topic Modeling — A Pathway to Uncovering Book Genres
To analyze the book genres I’ve read, I turned to topic modeling, an unsupervised technique that finds groups of words that tend to appear together across a collection of documents. By applying this method to my reading notes, I aimed to uncover common denominators among the books I’ve read and explore the impact of pivotal moments on my literary interests.
Topic modeling techniques such as Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF) enable the discovery of hidden topics within a collection of documents. By implementing these algorithms in my Python script, I could generate clusters of keywords, which served as a foundation for further analysis.
import en_core_web_md
import pyLDAvis
import pyLDAvis.gensim_models
from gensim.corpora import Dictionary
from gensim.models import LdaMulticore

# df_highlights (a DataFrame with a 'Highlights' column) and
# get_cleaned_string are defined elsewhere in the full script.
nlp = en_core_web_md.load()

# Part-of-speech tags I want to remove from the text
removal = ['ADV', 'PRON', 'CCONJ', 'PUNCT',
           'PART', 'DET', 'ADP', 'SPACE', 'NUM', 'SYM']

# Lemmatize each highlight, keeping alphabetic non-stop-word tokens longer than two characters
tokens = []
for highlight in nlp.pipe(df_highlights['Highlights']):
    proj_tok = [
        token.lemma_.lower() for token in highlight
        if token.pos_ not in removal and not token.is_stop
        and token.is_alpha and len(token) > 2
    ]
    tokens.append(proj_tok)
tokens_concatenated = [' '.join(doc) for doc in tokens]
tokens_cleaned = [get_cleaned_string(text) for text in tokens_concatenated]

# Build the dictionary and the bag-of-words corpus
dictionary = Dictionary(tokens)
dictionary.filter_extremes(no_below=5, no_above=0.5, keep_n=1000)
corpus = [dictionary.doc2bow(doc) for doc in tokens]

# Optimal model
topics_count = 15
lda_model = LdaMulticore(corpus=corpus, id2word=dictionary, iterations=100,
                         num_topics=topics_count, workers=4, passes=100)

# Print topics
lda_model.print_topics(-1)

# Visualize topics (pyLDAvis.display renders inside a notebook)
lda_display = pyLDAvis.gensim_models.prepare(lda_model, corpus, dictionary, R=10)
pyLDAvis.display(lda_display)

# Save the report
pyLDAvis.save_html(lda_display, f'data/generated_html/index_{topics_count}.html')
The entire code can be found here
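To make the dictionary step more concrete, here is a small pure-Python sketch of what filtering by document frequency and converting to a bag of words roughly amounts to. It is an approximation for illustration only, not gensim's implementation: the helper names build_vocabulary and to_bow and the four toy documents are my own assumptions, and the token ids it assigns will not match the ones gensim would produce.

```python
from collections import Counter


def build_vocabulary(docs, no_below=5, no_above=0.5, keep_n=1000):
    """Hypothetical helper: approximate the effect of Dictionary.filter_extremes."""
    # Document frequency: in how many documents each token appears
    doc_freq = Counter(token for doc in docs for token in set(doc))
    max_docs = no_above * len(docs)
    # Keep tokens that are neither too rare nor too common
    kept = [tok for tok, df in doc_freq.items() if no_below <= df <= max_docs]
    # Keep the keep_n most frequent survivors; ties are broken alphabetically
    kept.sort(key=lambda tok: (-doc_freq[tok], tok))
    return {tok: idx for idx, tok in enumerate(kept[:keep_n])}


def to_bow(doc, vocabulary):
    """Hypothetical helper: turn one tokenized document into (token_id, count) pairs."""
    counts = Counter(vocabulary[tok] for tok in doc if tok in vocabulary)
    return sorted(counts.items())


# Toy documents, invented for this example
docs = [
    ["sleep", "brain", "night"],
    ["sleep", "rem", "brain", "sleep"],
    ["habit", "goal", "time"],
    ["time", "sleep", "habit"],
]
vocab = build_vocabulary(docs, no_below=2, no_above=0.5, keep_n=10)
bows = [to_bow(doc, vocab) for doc in docs]
print(vocab)
print(bows)
```

In this toy corpus "sleep" appears in three of the four documents, which is above the no_above=0.5 threshold, so it is dropped along with the words that appear only once. The same logic explains why very common words in my highlights never show up in the topics.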
Harnessing ChatGPT to Unveil the Essence of Book Topics
With the keyword clusters identified, I employed ChatGPT, a powerful AI language model, to transform these clusters into meaningful topics. By providing ChatGPT with a prompt containing the keyword clusters, I obtained clear and concise topic titles that best encapsulated the essence of each cluster.
These AI-generated topic titles offered valuable insights into the genres and subjects covered in the books I’ve read, giving me a deeper understanding of my reading preferences and habits.
Q “in the list below there are multiple clusters defined by the list of words. for every list of words find an appropriate topic to describe the set of words: [[‘sleep’, ‘rem’, ‘hour’, ‘brain’, ‘night’, ‘percent’, ‘lose’, ‘morning’, ‘time’, ‘deep’], [‘new’, ‘network’, ‘link’, ‘idea’, ‘time’, ‘practice’, ‘assign’, ‘level’, ‘take’, ‘group’], [‘know’, ‘life’, ‘thing’, ‘love’, ‘people’, ‘learn’, ‘way’, ‘change’, ‘deal’, ‘single’], [‘time’, ‘change’, ‘end’, ‘percent’, ‘mean’, ‘build’, ‘experience’, ‘world’, ‘habit’, ‘seek’], [‘strategy’, ‘help’, ‘individual’, ‘life’, ‘goal’, ‘word’, ‘world’, ‘job’, ‘real’, ‘example’], [‘good’, ‘knowledge’, ‘human’, ‘strategy’, ‘work’, ‘future’, ‘find’, ‘new’, ‘pattern’, ‘value’], [‘action’, ‘resource’, ‘story’, ‘policy’, ‘great’, ‘life’, ‘high’, ‘good’, ‘give’, ‘ability’], [‘understand’, ‘say’, ‘percent’, ‘world’, ‘mind’, ‘truth’, ‘idea’, ‘control’, ‘human’, ‘field’], [‘day’, ‘turn’, ‘natural’, ‘change’, ‘important’, ‘new’, ‘book’, ‘potential’, ‘life’, ‘rate’], [‘cost’, ‘modern’, ‘dream’, ‘john’, ‘truth’, ‘end’, ‘create’, ‘build’, ‘product’, ‘fight’], [‘people’, ‘attention’, ‘second’, ‘book’, ‘person’, ‘read’, ‘small’, ‘day’, ‘state’, ‘mean’], [‘thing’, ‘person’, ‘go’, ‘individual’, ‘group’, ‘kill’, ‘type’, ‘identity’, ‘good’, ‘time’], [‘problem’, ‘experience’, ‘man’, ‘solve’, ‘learn’, ‘model’, ‘skill’, ‘ability’, ‘think’, ‘water’], [‘good’, ‘work’, ‘idea’, ‘think’, ‘feel’, ‘people’, ‘teach’, ‘life’, ‘look’, ‘fail’], [‘think’, ‘need’, ‘come’, ‘idea’, ‘strength’, ‘know’, ‘fact’, ‘good’, ‘hit’, ‘new’]]”
Based on the provided list of word clusters, I suggest the following topics to describe each set of words:
- Sleep Quality and Brain Function
- Networking and Collaborative Learning
- Personal Growth and Relationships
- Time and Adaptability
- Goal Setting and Personal Development
- Knowledge, Strategy, and Innovation
- Decision-making and Leadership
- Perception and Understanding
- Change and Personal Potential
- Creativity and Challenges
- Reading, Focus, and Learning
- Group Dynamics and Conflict Resolution
- Problem Solving and Skill Development
- Work Ethic and Personal Reflection
- Critical Thinking and Adaptability
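A prompt like the one above does not have to be typed by hand. The sketch below shows one way it could be assembled from the model output, assuming the topics are available as lists of (word, weight) pairs, which is the shape gensim returns from show_topics(formatted=False). The helper names top_words and build_prompt are hypothetical, and the weights in the example are invented placeholders, not results from my model.

```python
def top_words(topics, topn=10):
    """Hypothetical helper: keep the topn highest-weighted words of each topic.

    `topics` is a list with one entry per topic, each a list of (word, weight) pairs.
    """
    return [
        [word for word, _ in sorted(pairs, key=lambda p: p[1], reverse=True)[:topn]]
        for pairs in topics
    ]


def build_prompt(clusters):
    """Hypothetical helper: prepend the instruction to the keyword clusters."""
    instruction = (
        "in the list below there are multiple clusters defined by the list of words. "
        "for every list of words find an appropriate topic to describe the set of words: "
    )
    # repr() renders the nested list as [['sleep', 'rem'], ...]
    return instruction + repr(clusters)


# Invented placeholder weights, for illustration only
example_topics = [
    [("sleep", 0.05), ("rem", 0.03), ("hour", 0.02)],
    [("network", 0.02), ("new", 0.04), ("link", 0.01)],
]
clusters = top_words(example_topics, topn=2)
prompt = build_prompt(clusters)
print(prompt)
```

Generating the prompt this way makes it easy to rerun the naming step whenever the number of topics changes.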
Discovering Expected and Unexpected Themes
Some of the topics, such as ‘Sleep Quality and Brain Function,’ came as no surprise, given the extensive notes I had made while reading “Why We Sleep.” Seeing that book surface as its own topic suggested that the topic modeling and ChatGPT-driven analysis were picking up real patterns in my notes. Other themes, like ‘Networking and Collaborative Learning,’ resonated with my personal interests in meeting new people and seeking collaborative opportunities.
Exploring Evolving Interests
The analysis also shed light on the evolution of my interests over time. As a lifelong learner, my initial focus on engineering and computer science expanded to include topics like ‘Personal Growth and Relationships.’ This shift highlights my growing awareness of the importance of having the right people around me to achieve great things.
Uncovering Core Values and Beliefs
The presence of topics such as ‘Time and Adaptability,’ ‘Goal Setting and Personal Development,’ and ‘Knowledge, Strategy, and Innovation’ provided insights into the core values and beliefs that have shaped my reading choices. I have always admired people who successfully reinvent themselves and recognize the significance of adaptability in the face of changing circumstances. My interest in strategy, coupled with the understanding that goals without strategy amount to wishful thinking, is also evident in these topics.
Reflecting on Reading Patterns and Personal Growth
The diverse range of topics revealed through this analysis not only painted a vivid picture of my literary journey but also provided an opportunity for introspection and personal growth. By understanding the themes that have piqued my curiosity, I can better appreciate the influences that have shaped my perspectives, as well as identify areas where I may wish to expand my knowledge or explore new ideas.
Conclusion — The Enduring Impact of Reading on Personal Growth
As I reflect on this exploration of my reading history and the patterns and themes that have emerged from the analysis, I am reminded of how reading has played an integral role in shaping my life. Alongside regular workouts, reading remains one of my greatest passions. It fuels my curiosity, broadens my perspective, and equips me with the knowledge and insights to engage in meaningful conversations on a diverse range of topics.
Looking back, I have come to realize that the years in which I read the most tend to be the years I consider my most successful. This is a personal impression rather than something I have measured, but it reflects the impact that reading has on my personal growth, well-being, and achievements. By consistently engaging with new ideas, stories, and perspectives, I am able to continuously learn, adapt, and thrive in an ever-changing world.
I have made it a priority to read every day, as I recognize that without this daily nourishment for my mind, I experience a sense of hunger, a longing for the insights and inspiration that only books can provide. The journey through my literary landscape, powered by topic modeling and ChatGPT, has reaffirmed the importance of reading in my life and has inspired me to continue seeking out diverse and thought-provoking literature.
In an effort to connect with fellow readers and share my passion for books, I invite you to follow my reading list on Goodreads. This platform provides a space for us to exchange book recommendations, discuss our favorite reads, and suggest titles that we found interesting and fulfilling.
In the end, this exploration has not only provided a fascinating glimpse into the themes and genres that have shaped my reading journey but has also served as a powerful reminder of the transformative power of books. By continually feeding my mind and challenging my perspectives, I am better equipped to navigate the complexities of life and achieve greater success, both personally and professionally. I look forward to connecting with you on Goodreads and embarking on new literary adventures together.