Text Analysis Resources


  • LIWC: LIWC2015 is the gold standard in computerized text analysis. Learn how the words we use in everyday language reveal our thoughts, feelings, personality, and motivations. Based on years of scientific research, LIWC2015 is more accurate, easier to use, and provides a broader range of social and psychological insights compared to earlier LIWC versions. Check it out.
  • Coh-Metrix: Coh-Metrix is a system for computing cohesion and coherence metrics for written and spoken texts. It allows readers, writers, educators, and researchers to instantly gauge the difficulty of written text for the target audience.
  • Personality Insights by IBM Watson: Go beyond artificial intelligence with Watson. With Watson APIs and solutions, businesses are already achieving outcomes – from improving customer engagement, to scaling expertise, to driving innovation and growth.
  • SentiStrength: Automatic sentiment analysis of up to 16,000 social web texts per second with up to human level accuracy for English - other languages available or easily added.
  • Perspective API: Perspective is an API that makes it easier to host better conversations. The API uses machine learning models to score the perceived impact a comment might have on a conversation. Developers and publishers can use this score to give real-time feedback to commenters, help moderators do their job, or allow readers to more easily find relevant information. The first model identifies whether a comment could be perceived as “toxic” to a discussion.
  • ProfilerPlus.org: ProfilerPlus.org provides free text analysis services for academic and non-commercial purposes using Social Science Automation’s Profiler Plus.
    • Leadership Trait Analysis
    • Motivations
    • Verbal Behavior Analysis
    • Persuasion Strategies
  • Epistemic Network Analysis
  • Events Data:
    • events: Store and manipulate event data. Stores, manipulates, aggregates, and otherwise messes with event data from KEDS/TABARI or any other extraction tool with similar output.
    • The GDELT Story: GDELT is the largest, most comprehensive, and highest resolution open database of human society ever created. Creating a platform that monitors the world’s news media from nearly every corner of every country in print, broadcast, and web formats, in over 100 languages, every moment of every day and that stretches back to January 1, 1979 through present day, with daily updates, required an unprecedented array of technical and methodological innovations, partnerships, and whole new mindsets to bring this all together and make it a reality. Creating a database of a quarter billion georeferenced records covering the entire world over 30 years, coupled with the massive networks that connect all of the people, organizations, locations, themes, and emotions underlying those events, required not only solving unparalleled challenges to create the database, but also a “reimagining” of how we interact and think about societal-scale data.
    • Open Event Data Alliance
  • NLP for the Social Sciences: We present a number of freely available and user-friendly natural language processing tools for use in the social sciences. The tools run on a number of operating systems, including Mac and Windows, and provide measures related to lexical sophistication, text cohesion, syntactic complexity, lexical diversity, grammar/mechanics, and sentiment analysis.
  • Political Positions:
    • wordfish: Wordfish is a computer program written in the R statistical language to extract political positions from text documents. Word frequencies are used to place documents onto a single dimension. Wordfish is a scaling technique and does not need any anchoring documents to perform the analysis. Instead, it relies on a statistical model of word counts. The current implementation assumes a Poisson distribution of word frequencies. Positions are estimated using an expectation-maximization algorithm. Confidence intervals for estimated positions can be generated from a parametric bootstrap. The name Wordfish pays tribute to the French meaning of the word “poisson” (fish).
    • kbenoit/wordshoal: An extension package for quanteda for computing the “wordshoal” text model from Benjamin E. Lauderdale and Alexander Herzog. 2016. “Measuring Political Positions from Legislative Speech.” Political Analysis 24 (3, July): 374-394.
  • Negation:
  • Stylometry:
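
To make the Wordfish entry under Political Positions more concrete, here is a minimal Python sketch of the kind of Poisson word-count model it describes. It assumes the usual parameterization, in which the count of word j in document i is Poisson with rate exp(alpha_i + psi_j + beta_j * omega_i), where omega_i is the document's position. The function and variable names are illustrative and are not taken from the Wordfish R code, and the sketch only evaluates the likelihood; it does not perform the estimation step.

```python
import math


def expected_count(alpha_i, psi_j, beta_j, omega_i):
    """Poisson rate for word j in document i.

    alpha_i: document fixed effect (roughly, document length)
    psi_j:   word fixed effect (roughly, overall word frequency)
    beta_j:  how strongly word j discriminates along the dimension
    omega_i: position of document i on the single dimension
    """
    return math.exp(alpha_i + psi_j + beta_j * omega_i)


def log_likelihood(counts, alpha, psi, beta, omega):
    """Poisson log-likelihood of a document-by-word count matrix.

    counts is a list of rows (one per document), each a list of
    non-negative integer word counts (one per word).
    """
    total = 0.0
    for i, row in enumerate(counts):
        for j, y in enumerate(row):
            lam = expected_count(alpha[i], psi[j], beta[j], omega[i])
            # log Poisson pmf: y*log(lam) - lam - log(y!)
            total += y * math.log(lam) - lam - math.lgamma(y + 1)
    return total


# Toy data (made up for illustration): two documents, two words.
# Document 0 mostly uses word 0, document 1 mostly uses word 1.
counts = [[10, 1], [1, 10]]
alpha = [0.0, 0.0]
psi = [1.0, 1.0]
beta = [1.0, -1.0]

# Positions that match the word usage pattern, and the same positions swapped.
ll_matching = log_likelihood(counts, alpha, psi, beta, omega=[1.0, -1.0])
ll_swapped = log_likelihood(counts, alpha, psi, beta, omega=[-1.0, 1.0])
print(ll_matching > ll_swapped)
```

An estimation procedure searches for the positions and word parameters that maximize this likelihood. Placing the two toy documents at opposite ends in the way that matches their word usage gives a higher log-likelihood than swapping them.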

