# Sentiment Polarity Using Adjectives

Hatzivassiloglou and McKeown (1997) [1] identified constraints that conjunctions (such as “and” and “but”) place on the positive or negative semantic orientation of the adjectives they conjoin, and validated these constraints on a large corpus.

Orientation (Polarity) = direction of deviation from the norm

## Approach

• Indirect information:
• Adjectives conjoined by “and” usually have the same polarity, such as “simple and well-received”;
• Adjectives conjoined by “but” usually have different polarity, such as “simplistic but well-received”.

Figure: graph of adjectives in which a green line represents an “and” link and a red dashed line represents a “but” link. [2]
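The and/but constraints can be pictured as a labelled graph. The following is a minimal sketch, not the authors' implementation: the toy triples and the function name `build_link_graph` are hypothetical, and every conjunction other than “but” is simply treated as a same-orientation link.

```python
from collections import defaultdict

# Hypothetical toy input: (adjective, conjunction, adjective) triples.
conjunctions = [
    ("simple", "and", "well-received"),
    ("simplistic", "but", "well-received"),
    ("fair", "and", "legitimate"),
]

def build_link_graph(triples):
    """Return an undirected graph whose edges are labelled 'same' or 'different'."""
    graph = defaultdict(dict)
    for left, conj, right in triples:
        # Assumption: only "but" signals a change of orientation.
        link = "different" if conj == "but" else "same"
        graph[left][right] = link
        graph[right][left] = link
    return graph

graph = build_link_graph(conjunctions)
print(graph["well-received"])
```

Each adjective maps to its neighbours and the kind of link that connects them, which is the structure the later clustering step works on.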

## Data collection

The data come from the 21-million-word 1987 Wall Street Journal corpus, automatically annotated with part-of-speech tags using the PARTS tagger.

### Adjectives data preparation

• Construct a set of adjectives with predetermined orientation labels by taking all adjectives appearing in the corpus 20 times or more and removing adjectives that have no orientation;
• Assign an orientation label (either + or -) to each adjective, using an evaluative approach;
• Criterion: whether the use of this adjective ascribes in general a positive or negative quality to the modified item, making it better or worse than a similar unmodified item.
• Final set contained 1,336 adjectives (657 positive and 679 negative terms).
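The frequency filter in the first step could look roughly like the sketch below. It assumes the corpus is available as `(word, tag)` pairs and that adjectives carry the Penn Treebank tag `JJ`; the tag set of the PARTS tagger is not given here, so that tag, the toy corpus, and the `manual_labels` dictionary are all hypothetical.

```python
from collections import Counter

MIN_COUNT = 20  # threshold stated in the text

def frequent_adjectives(tagged_tokens, min_count=MIN_COUNT):
    """Return the adjectives (assumed tag 'JJ') seen at least min_count times."""
    counts = Counter(word.lower() for word, tag in tagged_tokens if tag == "JJ")
    return {word for word, n in counts.items() if n >= min_count}

# Hypothetical toy corpus of (word, tag) pairs.
tagged = [("good", "JJ")] * 25 + [("rare", "JJ")] * 3 + [("market", "NN")] * 40
candidates = frequent_adjectives(tagged)

# Hypothetical manual labels; adjectives without an orientation are simply absent.
manual_labels = {"good": "+"}
labeled = {word: manual_labels[word] for word in candidates if word in manual_labels}
```

Only frequent adjectives reach the manual labelling step, and anything the annotators leave unlabelled drops out of the final set.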

### Adjectives data validation

The authors subsequently asked 4 people to independently label a randomly drawn sample of 500 of these 1,336 adjectives. On average, the evaluators agreed with the authors that the positive/negative concept applies to 89.15% of these adjectives.

## Extract conjunctions between adjectives

Using a two-level finite-state grammar, the authors collected 13,426 conjunctions of adjectives, which expand to a total of 15,431 conjoined adjective pairs.
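As a much simpler stand-in for the finite-state grammar, the sketch below scans a tagged sentence for adjective-conjunction-adjective windows. It is an illustration only: the Penn Treebank tags `JJ` and `CC`, the `CONJUNCTIONS` set, and the function name are assumptions, and it does not expand longer coordinations into all of their pairs.

```python
CONJUNCTIONS = {"and", "or", "but"}  # assumed subset, for illustration

def adjective_conjunctions(tagged_tokens):
    """Yield (adj1, conjunction, adj2) for every JJ-CC-JJ window of three tokens."""
    windows = zip(tagged_tokens, tagged_tokens[1:], tagged_tokens[2:])
    for (w1, t1), (w2, t2), (w3, t3) in windows:
        if t1 == "JJ" and t2 == "CC" and t3 == "JJ" and w2.lower() in CONJUNCTIONS:
            yield (w1.lower(), w2.lower(), w3.lower())

# Hypothetical tagged sentence.
sentence = [
    ("The", "DT"), ("plan", "NN"), ("was", "VBD"),
    ("simplistic", "JJ"), ("but", "CC"), ("well-received", "JJ"), (".", "."),
]
pairs = list(adjective_conjunctions(sentence))
print(pairs)
```

A real grammar would also handle modifiers, commas, and multi-word conjunctions such as either-or, which this window-based scan ignores.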

### Test data

15,048 conjunction tokens involve 9,296 distinct pairs of conjoined adjectives (types).

Each conjunction token is classified by the parser according to three variables:

• the conjunction used (and, or, but, either-or, or neither-nor)
• the type of modification (attributive, predicative, appositive, resultative)
• the number of the modified noun (singular or plural)
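One way to hold these three variables, and to see the difference between conjunction tokens and pair types, is sketched below. The class name `ConjunctionToken` and the toy records are hypothetical and are not taken from the paper.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class ConjunctionToken:
    adj1: str
    adj2: str
    conjunction: str   # and, or, but, either-or, neither-nor
    modification: str  # attributive, predicative, appositive, resultative
    number: str        # singular or plural

    @property
    def pair(self):
        """Unordered adjective pair: the 'type' this token is an instance of."""
        return frozenset((self.adj1, self.adj2))

    @property
    def category(self):
        """Cross-classification of the three variables."""
        return (self.conjunction, self.modification, self.number)

# Hypothetical toy tokens.
tokens = [
    ConjunctionToken("fair", "legitimate", "and", "attributive", "singular"),
    ConjunctionToken("legitimate", "fair", "and", "predicative", "plural"),
    ConjunctionToken("simplistic", "well-received", "but", "predicative", "singular"),
]

types = {t.pair for t in tokens}                       # distinct pairs
category_counts = Counter(t.category for t in tokens)  # counts per category
print(len(tokens), "tokens,", len(types), "types")
```

Per-category counts of this kind are the sort of observed counts that a model over conjunction categories could take as input.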

## Validation of the Conjunction Hypothesis

• Prediction method 1 - Always predict same orientation:
• Always guess that a link is of the same-orientation type
• Prediction method 2 - But rule:
• Same as method 1, except that links formed by “but” are predicted to be of the different-orientation type
• Prediction method 3 - Log-linear model:
• $\eta = \mathbf{\omega}^\mathbf{T} \mathbf{x}$, $y = \frac{e^\eta}{1 + e^\eta}$
• $\mathbf{x}$: the vector of the observed counts in the various conjunction categories
• $\mathbf{\omega}$: the vector of weights to be learned
• $y$: the response of the system
• Using the method of iterative stepwise refinement, they selected 9 predictor variables from all 90 possible predictor variables.
• Morphological relationships:
• Adjectives related in form (for example, through a negating prefix or suffix) almost always have different semantic orientations
• Highly accurate (97.06%), but it applies to only a small fraction of the 891,780 possible pairs among the 1,336 labeled adjectives
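The two baselines and the logistic response $y = \frac{e^\eta}{1 + e^\eta}$ can be sketched as follows. The weights and counts are made-up numbers chosen only to exercise the formula, and the function names are hypothetical; this is not the fitted model from the paper.

```python
import math

def predict_always_same(conjunction):
    """Method 1: every link is predicted to be same-orientation."""
    return "same"

def predict_but_rule(conjunction):
    """Method 2: as method 1, except that 'but' predicts different orientation."""
    return "different" if conjunction == "but" else "same"

def logistic_response(weights, counts):
    """Method 3 response: eta = w^T x, y = e^eta / (1 + e^eta)."""
    eta = sum(w * x for w, x in zip(weights, counts))
    return 1.0 / (1.0 + math.exp(-eta))  # algebraically equal to e^eta / (1 + e^eta)

# Hypothetical weights and observed counts for two conjunction categories.
weights = [0.8, -1.5]
counts = [3, 1]
y = logistic_response(weights, counts)

# Sanity check on the pair count quoted above: C(1336, 2) unordered pairs.
assert math.comb(1336, 2) == 891_780
```

Writing the response as `1 / (1 + exp(-eta))` avoids computing a large `exp(eta)` in both numerator and denominator when `eta` is large and positive.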

## Clustering

### Input

A graph of adjectives connected by dissimilarity links.
Dissimilarity value $d(x, y)$ between 0 and 1:

• Small $d(x, y)$ $\Rightarrow$ same-orientation link between $x$ and $y$
• High $d(x, y)$ $\Rightarrow$ different-orientation link between $x$ and $y$

To partition the graph nodes into subsets of the same orientation, the authors employ an iterative optimization procedure on each connected component, based on the exchange method, a non-hierarchical clustering algorithm.

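The objective function $\Phi$ scores each possible partition $\mathcal{P}$ of the adjectives into two subgroups $C_1$ and $C_2$ as

$$\Phi(\mathcal{P}) = \sum_{i=1}^{2} \left( \frac{1}{|C_i|} \sum_{x, y \in C_i} d(x, y) \right)$$

where $|C_i|$ stands for the cardinality of cluster $i$ and $d(x, y)$ is the dissimilarity between adjectives $x$ and $y$. The partition that minimizes $\Phi$ is selected.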
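A small sketch of the optimization follows. It assumes that $\Phi$ adds up, for each of the two clusters, the within-cluster dissimilarities divided by the cluster size, and it uses a simplified first-improvement variant of the exchange idea (move one adjective at a time, keep the move if $\Phi$ drops). The toy dissimilarities, the default of 0.5 for unlinked pairs, and the function names are assumptions.

```python
from itertools import combinations

def phi(clusters, d, default=0.5):
    """Sum over clusters of (within-cluster dissimilarity) / (cluster size).

    Each unordered pair is counted once; counting ordered pairs instead would
    only scale the score by a constant. Unlinked pairs get an assumed default.
    """
    total = 0.0
    for cluster in clusters:
        if cluster:
            within = sum(d.get(frozenset(p), default)
                         for p in combinations(sorted(cluster), 2))
            total += within / len(cluster)
    return total

def exchange_partition(nodes, d, start):
    """Greedy single-node moves between two clusters until no move lowers phi."""
    c1, c2 = set(start), set(nodes) - set(start)
    best = phi((c1, c2), d)
    improved = True
    while improved:
        improved = False
        for node in sorted(nodes):
            src, dst = (c1, c2) if node in c1 else (c2, c1)
            if len(src) == 1:
                continue  # keep both clusters non-empty
            src.remove(node)
            dst.add(node)
            score = phi((c1, c2), d)
            if score < best - 1e-12:
                best, improved = score, True
            else:  # undo the move
                dst.remove(node)
                src.add(node)
    return c1, c2, best

# Hypothetical dissimilarities: small within an orientation, high across.
nodes = ["good", "nice", "bad", "poor"]
d = {frozenset(("good", "nice")): 0.1, frozenset(("bad", "poor")): 0.1}
for a in ("good", "nice"):
    for b in ("bad", "poor"):
        d[frozenset((a, b))] = 0.9

c1, c2, best = exchange_partition(nodes, d, start={"good", "bad"})
print(sorted(c1), sorted(c2))
```

Like any local search, the result can depend on the starting partition, so in practice one would typically try several starts and keep the lowest score.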

### Labeling the Clusters as Positive or Negative

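In a pair of antonymous adjectives, the semantically unmarked member almost always has positive orientation (Lehrer, 1985; Battistella, 1990), and the unmarked member also tends to be the more frequent of the two. Thus: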

They computed the average frequency of the words in each group, expecting the group with higher average frequency to contain the positive terms.
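The labelling rule can be sketched in a few lines. The word frequencies below are invented for illustration, and `label_clusters` is a hypothetical helper, not code from the paper.

```python
def label_clusters(c1, c2, freq):
    """Label the cluster with the higher average word frequency as positive."""
    def mean_freq(cluster):
        return sum(freq[word] for word in cluster) / len(cluster)

    positive, negative = (c1, c2) if mean_freq(c1) >= mean_freq(c2) else (c2, c1)
    labels = {word: "+" for word in positive}
    labels.update({word: "-" for word in negative})
    return labels

# Hypothetical corpus frequencies.
freq = {"good": 900, "nice": 300, "bad": 400, "poor": 200}
labels = label_clusters({"bad", "poor"}, {"good", "nice"}, freq)
print(labels)
```

The comparison uses the average rather than the total frequency, so a large cluster of rare words does not win merely by size.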

## Conclusion

The authors also tested how graph connectivity, that is, how many conjunction links each adjective takes part in, affects the overall performance.

1. Hatzivassiloglou, Vasileios and McKeown, Kathleen R. (1997). Predicting the semantic orientation of adjectives, pp. 174-181.
2. 7 - 4 - Learning Sentiment Lexicons - Stanford NLP - Professor Dan Jurafsky & Chris Manning
