Python: Automatically Visualize a dataset, autoviz ver. 5


I have changed the scripts so that they read the data from a URL with pandas.

I.What will you learn?

Let me show you how to use AutoViz, which visualizes a dataset with a single line of code. It creates bar charts grouped by the string-valued columns, like these:

These charts are very useful for exploratory data analysis (EDA). As far as I know, other EDA libraries do not create this kind of chart automatically.

Some more related charts are added if you give depVar="nationality", as Ram Seshadri commented on this post below. They show how the distributions differ by nationality and give better insight into the variables.

II.Data

The data I used is here:

https://pastebin.com/raw/cSZ8pYWh

Please download the above data and save it under a name like "TITLE.csv".
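Since the scripts now read the data from the URL, here is a minimal sketch of that step. It assumes the file is comma-separated with a header row; the function name load_dataset and the sample rows and column names are made up for illustration and are not the real data.

```python
import io

import pandas as pd

# URL given in this post. Assumption: comma-separated text with a header row.
DATA_URL = "https://pastebin.com/raw/cSZ8pYWh"


def load_dataset(source=DATA_URL):
    """Read the data into a DataFrame.

    `source` may be a URL, a local path such as "TITLE.csv",
    or a file-like object; pandas.read_csv accepts all three.
    """
    return pd.read_csv(source)


# Offline demonstration with made-up rows (NOT the real data).
sample_csv = io.StringIO(
    "nationality,english,japanese\n"
    "Japanese,70,95\n"
    "Nepalese,85,60\n"
)
sample = load_dataset(sample_csv)
print(sample.shape)  # (2, 3)
```

Calling load_dataset() with no argument fetches the data from the URL, and df.to_csv("TITLE.csv", index=False) would keep a local copy if you prefer to work from a file.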

III.Scripts

You can see my sample code here:

https://colab.research.google.com/drive/1o1tEz6IyJnPwpFVEs42E16Q0lZXCcJgT?usp=sharing
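The notebook is not reproduced here, but the call discussed in this post can be sketched roughly as follows. This is my own illustration, not a copy of the notebook: the wrapper name visualize is hypothetical, and the AutoViz arguments follow the signature documented in the AutoViz README, which may differ between versions.

```python
import pandas as pd


def visualize(df, dep_var=""):
    """Run AutoViz on a DataFrame, optionally with a target column.

    With dep_var="" AutoViz draws its default charts; with a column name
    such as "nationality" it adds charts broken down by that column.
    """
    if dep_var and dep_var not in df.columns:
        raise ValueError(f"column {dep_var!r} not found in the data")

    # Imported here so the check above works even without autoviz installed.
    from autoviz.AutoViz_Class import AutoViz_Class

    av = AutoViz_Class()
    # The filename is left empty because the data is passed in through dfte.
    return av.AutoViz("", sep=",", depVar=dep_var, dfte=df, header=0, verbose=0)
```

For example, visualize(df) gives the default charts, and visualize(df, dep_var="nationality") gives the additional charts by nationality, assuming df has a column with that name.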

About shibatau

I was born and raised in Kyoto. I studied western philosophy at university and specialized in analytic philosophy, especially Ludwig Wittgenstein, in graduate school. I'm interested in new technology, especially machine learning. I have been learning the R language for two years and began to learn Python last summer. Listening to Paramore, Sia, Amazarashi and Miyuki Nakajima. Favorite movies I've recently seen: "FREEHELD". Favorite actors and actresses: Anthony Hopkins, Denzel Washington, Ellen Page, Meryl Streep, Mia Wasikowska and Robert DeNiro. Favorite books: Fyodor Mikhailovich Dostoyevsky, "The Karamazov Brothers", Shinran, "Lamentations of Divergences". Favorite phrase: Salvation by Faith. Twitter: @shibatau

2 Comments

  1. Hi shibatau:
    This is a fantastic blog post. I noticed that you did not use the depVar variable option for AutoViz which I think would have given you some more “interesting” insights on your small sample data. For example, if you had given:
    depVar="nationality"

    You would have gotten very interesting insights that would tell you whether scores in English or Japanese can help predict whether a person is Japanese or Nepalese. Also the kind of charts that AutoViz generates would also be totally different.

    So maybe it's time to do AutoViz blog post part 3?

    Thanks
    Ram

    • Hi Ram,
      thank you for your comment.
      I’ve put “nationality” as an argument for the depVar as you said.
      That’s great. I’ve been looking for a library for EDA like this.
      I appreciate the useful suggestion.
      Sure, I will add some comments on the depVar variable soon.

      Best,
      shibatau
