Last Updated on June 21, 2022 by shibatau
Automatically visualize a dataset
Let me show you a sample script that uses AutoViz, which visualizes a dataset with a single line of code. You can also target a particular variable with the depVar argument. The following code creates charts of the scores grouped by nationality:
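# import AutoViz and create an instance (assumes the autoviz package is installed)
from autoviz.AutoViz_Class import AutoViz_Class
AV = AutoViz_Class()

# create charts with nationality targeted
filename = "/content/file1.csv"
sep = ","
dft = AV.AutoViz(
    filename,
    sep=sep,
    depVar="nationality",  # the target variable
    dfte=None,
    header=0,
    verbose=0,
    lowess=False,
    chart_format="svg",
    max_rows_analyzed=150000,
    max_cols_analyzed=30,
    save_plot_dir=None
)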
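
As a rough cross-check of what the grouped charts should show, here is a minimal sketch that computes the mean score per nationality using only the Python standard library. It does not use AutoViz at all. The sample rows, the column names "english" and "japanese", and the helper mean_scores_by_group are assumptions made up for illustration; they are not taken from file1.csv.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Made-up sample data; the column names are assumptions for illustration only.
SAMPLE_CSV = """nationality,english,japanese
Japanese,62,95
Japanese,70,90
Nepalese,85,60
Nepalese,75,70
"""

def mean_scores_by_group(csv_text, group_col, score_cols):
    """Return {group: {score_col: mean score}} for the given CSV text."""
    buckets = defaultdict(lambda: defaultdict(list))
    for row in csv.DictReader(io.StringIO(csv_text)):
        for col in score_cols:
            # Collect each score under its group (here, the nationality).
            buckets[row[group_col]][col].append(float(row[col]))
    return {
        group: {col: mean(values) for col, values in cols.items()}
        for group, cols in buckets.items()
    }

summary = mean_scores_by_group(SAMPLE_CSV, "nationality", ["english", "japanese"])
for group, scores in summary.items():
    print(group, scores)
```

If the group means differ clearly, the charts that AutoViz draws with depVar set to the same column should show a similar separation.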

You can see my sample code here:
https://colab.research.google.com/drive/1o1tEz6IyJnPwpFVEs42E16Q0lZXCcJgT?usp=sharing
Hi shibatau:
This is a fantastic blog post. I noticed that you did not use the depVar variable option for AutoViz which I think would have given you some more “interesting” insights on your small sample data. For example, if you had given:
depVar="nationality"
You would have gotten very interesting insights that would tell you whether scores in English or Japanese can help predict whether a person is Japanese or Nepalese. The kinds of charts that AutoViz generates would also be totally different.
So maybe it is time for AutoViz blog post part 3?
Thanks
Ram
Hi Ram,
Thank you for your comment.
I’ve set depVar to “nationality” as you suggested.
The results are great. I’ve been looking for an EDA library like this.
I appreciate the useful suggestion.
Sure, I will add some comments on the depVar argument soon.
Best,
shibatau