Event/Topic Analysis of News Articles [Cluster Analysis]

OOYUZ has a highly optimized clustering algorithm, specially designed for news clustering. [Learn More About Algorithm Link]

For example, let's perform cluster analysis for the term “Barack Obama” : [Link]

On the landing page, news results are loosely clustered (grouped) by news topic. This view is good for reading.

There are three options provided at the very top of the page.

 

A key advantage of this algorithm is that the user can select the degree of homogeneity among the articles in a cluster.

a) Grouped News : loosely grouped news articles, meant for reading.

b) Similar News : news articles grouped according to their topic/event.

c) Exactly Same News : news articles with the same content but different publishers.

 

 

barack obama - News results clustered by topics-events - OOYUZ News Analytics (1)

 

 

The default page is Grouped News, which is a reading mode. If you want news articles grouped according to topic/event, click “More Similar News”, and if you want to see identical news articles grouped together, click “Exact Same News”.

The latter two options are especially helpful for analyzing news articles. Below are screenshots of the three options :

1) Grouped News

barack obama - News results clustered by topics-events - OOYUZ News Analytics (2)

 

The View More button loads all articles in a cluster.

 

2) More Similar News

As you can see, news articles are now grouped more closely according to their topics/events.

barack obama - News results clustered by topics-events - OOYUZ News Analytics (3)

 

3) Exact Same News

This is especially helpful for users who are here for in-depth analysis of news. It gives you all news articles with more or less the same content.

 

barack obama - News results clustered by topics-events - OOYUZ News Analytics (4)

Only the first two clusters are shown in the examples above.

Click on any title to get information about that news article.

barack obama - News results clustered by topics-events - OOYUZ News Analytics

 

When you open an article, you get various relevant information about it, such as its social count, approximate time to read, excerpt, source and related articles.

We have already discussed how Timeline, Publisher & Social Media Analysis can be done using OOYUZ.

 

 

 

Social Media Analysis of News Articles

Analyse News for barack obama by Social Media Popularity - OOYUZ News Analytics

Apart from Timeline Analysis, Publisher Analysis & Topic/Event Analysis, OOYUZ provides social media analysis of news articles.

In this post, we shall explore the features OOYUZ offers for analyzing news articles according to their social media popularity.

For example, let's perform social media analysis for the term “Barack Obama” : [Link]

You get results like those in the image below :

Analyse News for barack obama by Social Media Popularity - OOYUZ News Analytics (1)

As you can see above, news articles are divided into five categories. The names are self-explanatory. You can browse articles from the most popular on social media to the least popular.

You get a few top articles on this page. If you want to explore more articles in a category, click the button titled “More”.

Analyse News for barack obama by Social Media Popularity - OOYUZ News Analytics (2)

 

Here is a further in-depth analysis of “Viral Articles” for the search term “Barack Obama” : [Link]

The “Viral Articles” category is further divided into 5 ranges. Range-5 has the articles with the highest social media popularity and Range-1 has those with the lowest.
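How the five ranges are computed is not described here. As a rough sketch only, one simple way to get such ranges is to rank articles by social count and cut the ranking into five equal slices. The function name `assign_ranges`, the `social_count` field and the sample data are all assumptions made for illustration, not OOYUZ's actual method.

```python
def assign_ranges(articles, n_ranges=5):
    """Split articles into n_ranges equal-size buckets by social count.

    Range 1 holds the least shared articles, range n_ranges the most shared.
    Equal-size rank buckets are an assumption; the real cut-offs may differ.
    """
    ranked = sorted(articles, key=lambda a: a["social_count"])
    ranges = {r: [] for r in range(1, n_ranges + 1)}
    for i, article in enumerate(ranked):
        # Position in the ranking decides the bucket.
        r = i * n_ranges // len(ranked) + 1
        ranges[r].append(article["title"])
    return ranges


# Hypothetical sample data: ten articles with increasing share counts.
articles = [{"title": f"article-{n}", "social_count": n * 100} for n in range(10)]
ranges = assign_ranges(articles)
print(ranges[5])  # the two most shared articles
print(ranges[1])  # the two least shared articles
```

Opening Range-5 then corresponds to reading `ranges[5]`.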

barack obama - News articles according to their social media popularity - OOYUZ News Analytics

 

barack obama - News articles according to their social media popularity - OOYUZ News Analytics (1)

Let's open Range-5 to see the most popular articles :

 

barack obama - News articles according to their social media popularity - OOYUZ News Analytics (2)

 

In the image above, the most popular articles are listed.

When you open an article, you get various relevant information about it, such as its social count, approximate time to read, excerpt, source and related articles.

 

barack obama - News articles according to their social media popularity - OOYUZ News Analytics (3)

 

 

Cluster Analysis : A new approach to grouping similar news articles

In addition to Publisher, Social and Timeline analysis, you get a very effective way to read news articles according to their similarity. This new way is the

 

OOYUZ currently provides three levels of clustering : “Similar News”, “More Similar News” & “Exact Same News”.


The purpose of these three categories is to provide clusters with different degrees of similarity among their articles. When you click “Exact Same News”, we make a best effort to deliver groups containing articles that are identical or nearly identical to each other.

 

Learn more about cluster analysis :  [Link to technology page]

 

Publisher Analysis Of News Articles

 

 

News Results for barack obama sorted by publishers - OOYUZ

OOYUZ enables you to group results by source and to further analyse the timeline of each source.

For example, searching for the term “Barack Obama” [Link] gives the following results (image is truncated). You get a list of publishers sorted by article count.
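As a minimal sketch of what "publishers sorted by article count" means, the snippet below counts articles per publisher and orders the result. The function name, the `publisher` field and the publisher names are hypothetical and only serve the example.

```python
from collections import Counter


def publishers_by_article_count(articles):
    """Return (publisher, count) pairs, most prolific publisher first."""
    counts = Counter(article["publisher"] for article in articles)
    return counts.most_common()


# Hypothetical search results for one term.
results = [
    {"publisher": "Sample Times", "title": "A"},
    {"publisher": "Daily Example", "title": "B"},
    {"publisher": "Sample Times", "title": "C"},
    {"publisher": "Test Tribune", "title": "D"},
    {"publisher": "Sample Times", "title": "E"},
    {"publisher": "Daily Example", "title": "F"},
]
print(publishers_by_article_count(results))
```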

Timeline Analysis of News Articles

barack obama - Date timeline for news results - OOYUZ News Analytics

OOYUZ enables users to explore news in a variety of ways. One important criterion for sorting articles is their timeline. For a search term, you get a timeline of articles and an option to further analyse each date by publisher.

For example, here is the analysis for the term “Barack Obama” : [Link]

View 1

barack obama - Date timeline for news results - OOYUZ News Analytics (1)

View 2

Here is an opened view :

barack obama - Date timeline for news results - OOYUZ News Analytics (2)

 

 

On this page, you will see a few top articles for each date. There is an option for exploring each date further : “Analyse more by Publisher”.

This gives you all publishers that published articles for the search term on the selected date.

Title - barack obama - News articles on 2014-11-26 - OOYUZ News Analytics

Title - barack obama - News articles on 2014-11-26 - OOYUZ News Analytics (1)

 

 

Publishers are sorted according to number of articles. Open any publisher to get all articles.
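The date-then-publisher drill-down can be pictured as a two-level grouping. The sketch below is only an illustration with made-up field names (`date`, `publisher`, `title`) and sample data. It does not describe how OOYUZ stores or queries its articles.

```python
from collections import defaultdict


def timeline(articles):
    """Group article titles by date, then by publisher."""
    by_date = defaultdict(lambda: defaultdict(list))
    for a in articles:
        by_date[a["date"]][a["publisher"]].append(a["title"])
    return by_date


def publishers_on(by_date, date):
    """Publishers for one date, sorted by number of articles (descending)."""
    publishers = by_date.get(date, {})
    return sorted(publishers.items(), key=lambda kv: len(kv[1]), reverse=True)


# Hypothetical articles for one search term.
sample = [
    {"date": "2014-11-26", "publisher": "Daily Example", "title": "B"},
    {"date": "2014-11-26", "publisher": "Sample Times", "title": "A"},
    {"date": "2014-11-26", "publisher": "Sample Times", "title": "C"},
    {"date": "2014-11-25", "publisher": "Daily Example", "title": "D"},
]
by_date = timeline(sample)
print(publishers_on(by_date, "2014-11-26"))
```

Opening a publisher in the list corresponds to reading the titles stored under that publisher for the selected date.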

 

Title - barack obama - News articles on 2014-11-26 - OOYUZ News Analytics (2)

When you open an article, you get various relevant information about it, such as its social count, approximate time to read, excerpt, source and related articles.

Title - barack obama - News articles on 2014-11-26 - OOYUZ News Analytics (3)

 

Currently, we offer analysis of the past 7 days.
OOYUZ offers even more ways to explore news articles, for example by social media popularity, by publisher, or by topic/event.

 

Publisher Analysis, Date Analysis & Social Media Analysis @ OOYUZ.Com

OOYUZ is a news search + monitoring + analysis application that lets users explore & analyse different aspects of news articles.

On OOYUZ, you can explore articles in different ways, whether you are researching before writing a detailed report on a subject, monitoring a topic in the news, or simply reading and exploring news articles more smartly.

Currently, OOYUZ offers three ways of exploring news. Here is a brief overview of them.

 

 

Technology

OOYUZ is a news search, monitoring & analytics application. It aims to give more power to the user by allowing searches by a variety of parameters.

Technology :

To build OOYUZ, we developed several advanced text analysis algorithms [and we are still working on them]. The key technical features that make OOYUZ extremely useful for news analysis are :

1) Text Extraction :

For large-scale crawling and data retrieval, we needed high-performance text extraction, so we built it from scratch. For this algorithm, we observed patterns across thousands of web pages. Instead of selecting relevant text areas, the algorithm deletes irrelevant text areas, which makes text extraction highly optimized and high-performing. Good text extraction is critically important for delivering relevant results.
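To illustrate the "delete the irrelevant areas" idea, here is a minimal sketch built on Python's standard `html.parser`. It is not the OOYUZ extractor. The list of tags treated as irrelevant is an assumption for the example, whereas the real algorithm is based on patterns observed across thousands of pages.

```python
from html.parser import HTMLParser

# Assumed list of containers that rarely hold article text.
IRRELEVANT_TAGS = {"script", "style", "nav", "header", "footer", "aside", "form"}


class SubtractiveExtractor(HTMLParser):
    """Keep all text except what sits inside an irrelevant container."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while inside an irrelevant container
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in IRRELEVANT_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in IRRELEVANT_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def extract_text(html):
    parser = SubtractiveExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


page = (
    "<html><body><nav>Home | World</nav>"
    "<p>Obama spoke today.</p>"
    "<script>var x = 1;</script>"
    "<footer>Copyright</footer></body></html>"
)
print(extract_text(page))
```

The appeal of the subtractive approach is that the extractor never has to guess which block is "the article". It only needs rules for what to throw away.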

2) Clustering :

For reading news articles according to their topic, we developed a text clustering algorithm. You can get an overview of how it works here : [Link]

A key advantage of this algorithm is that the user can select the degree of homogeneity among the articles in a cluster.

a) Grouped News : loosely grouped news articles, meant for reading.

b) Similar News : news articles grouped according to their topic/event.

c) Exactly Same News : news articles with the same content but different publishers.

Clustering is done in real-time.
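The OOYUZ clustering algorithm itself is not described here, so the following is only a toy sketch of how a single similarity threshold can control homogeneity. It uses Jaccard similarity on title words and a greedy single-pass grouping. Both are stand-in techniques, and the threshold values, function names and sample titles are assumptions.

```python
import re

# Hypothetical thresholds for the three levels; the real values are not published.
THRESHOLDS = {"grouped": 0.2, "similar": 0.5, "exact": 0.9}


def tokens(text):
    """Lower-cased set of words in a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Share of words two token sets have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(titles, level):
    """Greedy single pass: join the first cluster whose seed is similar enough."""
    threshold = THRESHOLDS[level]
    clusters = []  # each cluster is a list of (title, token set); item 0 is the seed
    for title in titles:
        toks = tokens(title)
        for members in clusters:
            if jaccard(toks, members[0][1]) >= threshold:
                members.append((title, toks))
                break
        else:
            clusters.append([(title, toks)])
    return [[t for t, _ in members] for members in clusters]


titles = [
    "Obama signs climate bill",
    "Obama signs climate bill",
    "Obama signs new climate bill into law",
    "Obama visits climate summit",
]
for level in ("grouped", "similar", "exact"):
    print(level, cluster(titles, level))
```

With these sample titles, the loosest level puts everything into one group, the middle level separates the summit story, and the strictest level keeps only the two identical titles together.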

3) Graphical Representation :

To build OOYUZ, we decided to use Snap instead of D3 to have better control. Read the blog post about it here.

4) Filtering :

With a NoSQL store, date analysis and publisher analysis are highly optimized for performance.

5) Hierarchical Design :

The layouts of OOYUZ search pages for different parameters were designed after close observation of user behavior patterns. For example, in date analysis, the user gets the top articles for each date on one page, and exploring more articles for a date takes one simple click. There, the user gets all publishers that published for the search term on that day. Publisher analysis works similarly. Learn more about Publisher Analysis, Timeline Analysis, Social Media Analysis & Cluster Analysis.

It's All About Control : Why We Chose Snap.svg over D3

Snap [link] is relatively new, but it's modern. It provides an excellent starter tutorial as well [link]. To be precise, the only thing that made us choose Snap (and not D3) is control. I tried my hand at D3 first. It's an awesome library for creating amazing graphs and charts in SVG. It's modern and full of examples. There is probably nothing that has not already been built using D3. But with great power comes great abstraction. D3 takes care of almost every detail. Looping is easy and it rules at data binding. No other library, as far as I know, is as good as D3 at data binding. Parsing data in virtually any format and presenting it on the front end can't be faster. Then why did we choose Snap ?

As already answered : better control. While designing OOYUZ [link], we felt the need for better control. Using Snap, we made the following charts :

[Charts 1 to 6]

Snap enables me to do things the way I want to do them. That's the first and foremost priority for me when selecting a library. Snap is modern, is backed by the creator [name] of Raphael, and is a very neat and clean library.

After working with D3 for a few days, I realized it was overkill for us. I may be the first and only person saying this, given D3's popularity, but truth be told, I find it overkill. That's a good thing, in fact : D3 has so many features, and almost everything is built in. But what if you want to tweak things a bit ? Or you come from a CSS background ? Or you just want a few charts and have a comfortable amount of time to design them ? Hence, as in every language or library war, the choice of a library depends on requirements and resources. Nothing is good and nothing is bad; it's all relative. So far, I am in love with Snap. Maybe in the near future, when we need to create a number of charts for another feature, we shall turn to D3, which provides all the tools needed to develop any chart under the sun.

About OOYUZ

OOYUZ is a news search, monitoring & analytics application. It aims to give more power to the user by allowing searches by a variety of parameters.

 

It's a news search + monitoring + analysis application that lets users explore & analyse the different aspects of news articles listed below :

 

 

Learn More about Technology @ OOYUZ.

 

People :

OOYUZ is developed by two computer science engineers. You can find us (or get in touch) on Twitter :

Neha Bhatt & Akshay Bhatt

 

 

 

 
