The Zapping Insights: Indonesia’s Electric Vehicle

Agha Maretha
10 min read · Sep 5, 2022


Do you remember watching movies about flying cars of the future when you were a kid? Remember how awesome they were? Well, even though we haven’t reached that point yet, we are stepping up to the start line with the next best thing: electric vehicles.

Electric vehicles have become quite the topic in the past few months, and like every hot topic, there are various discussions going around on social media. Add press coverage and word of mouth, and it becomes a challenge to determine how people actually feel about them.

Traffic jam on Jl. Sudirman, Jakarta, on March 16, 2020. (Antara/Galih Pradipta)

Fortunately, Indonesia has a social media platform where people LOVE to write to their heart’s content. Yup, you guessed it right: it’s Twitter! Twitter will provide the data we need to see how our fellow Indonesians react to electric vehicles.

In this analysis, we are going to find out the most mentioned things, the related topics, and the sentiment toward electric vehicles in general. We use data analytics and machine learning methods to provide insights and recommendations that electric vehicle officials could use for further improvement (fingers crossed!).

Spoiler Alert: This article is a (partially) end-to-end project. So, don’t let the technical stuff we wrote throw you off :)

Rationale

  1. We will be using the Twitter API to scrape the data from Twitter. This API can extract data from up to a week back.
  2. The data used in this analysis run from July 15th until July 22nd, 2022. Over this period there are 3,983 unique rows that we can process further.
  3. Since we want to know the sentiment of Indonesian society toward electric vehicles, the data we extract are tweets coming from Indonesian users.
  4. The data extracted from Twitter are tweets that contain the following words:
  • Mobil Listrik
  • Motor Listrik
  • Kendaraan Listrik
  • Sepeda Listrik
  • Electric Vehicle

  5. In this analysis the data set contains the following information:

  • Twitter Username
  • User Tweet
  • Timestamp of the Tweet
  • Verified Account label (True if the account is an official, verified account)

Hypothesis

  1. A certain type of electric vehicle is more popular among Indonesian users
  2. A specific topic was trending among Twitter users during July 15th to July 22nd, 2022

Data Extraction

The data we collected were extracted directly from Twitter during a fixed period. To extract data from Twitter, we need to create a Twitter account and register as a Twitter developer on the developer website. This step is essential for getting the API key and token. For more on this, you can follow the step-by-step guide written by Ahmed Besbes. After setting up the account, API key, and token, we can extract the data using the following script:

import tweepy

# Write down the keys (the key and token values below are placeholders)
api_key = 'your_api_key'
api_key_secret = 'your_api_key_secret'
access_token = 'your_access_token'
access_token_secret = 'your_access_token_secret'

# authentication
auth = tweepy.OAuthHandler(api_key, api_key_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)

# search for a specific keyword/hashtag
keyword = 'kendaraan listrik'
# define the extraction dates (can reach back up to a week)
start_date = '2022-07-15'
end_date = '2022-07-22'
# maximum number of tweets to fetch; placeholder value, the original script does not show it
limit = 1000

# api.search is the Tweepy 3.x name; Tweepy 4.x renamed it to api.search_tweets
search_tweets = tweepy.Cursor(api.search, q=keyword, lang="id", since=start_date, until=end_date, count=100, tweet_mode='extended').items(limit)

After extracting the data with tweepy, we store the information as a DataFrame for further analysis:

import pandas as pd

data = []
columns = ['Time', 'User', 'Tweet', 'is_verified_acc']
for tweet in search_tweets:
    # one row per tweet: timestamp, username, full text, verified flag
    data.append([tweet.created_at, tweet.user.screen_name, tweet.full_text, tweet.user.verified])
df = pd.DataFrame(data, columns=columns)

Data Preprocessing

Data Info

We collected tweets from July 15th, 2022 until July 22nd, 2022, specifically from Indonesia, that contain the keywords mentioned before. There are a total of 3,983 rows with 4 columns: the tweet timestamp, the username, the tweet, and an indicator of whether the user is a verified account.

Fig 1. Example of the tweets

Pre-processing Steps

Tweets are basically text, and handling text takes some care to make sure the data used for the analysis is tidy and easy to understand. Since there is a limitless number of ways people on Twitter can form their tweets (emoji, punctuation marks, combinations of words, etc.), we pre-processed the data in several steps before the analysis.
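As an illustration only, here is a minimal sketch of what this kind of tweet cleaning could look like. The specific steps (lowercasing, dropping URLs, mentions, emoji and punctuation) and the helper name `clean_tweet` are our assumptions for the sake of the example, not the exact pipeline used in this project.

```python
import re

# Hypothetical cleaning rules; the article does not list the exact steps.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
# Keep lowercase letters, digits and whitespace; everything else
# (emoji, punctuation, other symbols) is replaced by a space.
NON_WORD_RE = re.compile(r"[^a-z0-9\s]")


def clean_tweet(text: str) -> str:
    """Return a lowercased tweet without URLs, mentions, emoji or punctuation."""
    text = text.lower()
    text = URL_RE.sub(" ", text)       # drop links
    text = MENTION_RE.sub(" ", text)   # drop @username mentions
    text = text.replace("#", " ")      # keep the hashtag word, drop the '#'
    text = NON_WORD_RE.sub(" ", text)  # drop emoji and punctuation
    return " ".join(text.split())      # collapse repeated whitespace


# Made-up example tweet
example = "Wah, mobil listrik baru! 🚗⚡ cek https://t.co/abc @user #KendaraanListrik"
print(clean_tweet(example))
```

The word clouds later in the article show stemmed forms such as “kendara” and “larang”, which suggests that a stemming step for Indonesian text was also applied; the sketch above does not cover that.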

Analysis

Tweet Length Distribution

The manner in which Twitter users discuss electric vehicles may vary, and one simple way to see this is through the length of the tweet. A short tweet might suggest that the topic is being discussed in a conversation (replies and mentions) or in a brief statement. A moderate to long tweet might suggest that the topic is being discussed in a thread or in a long statement. Let’s take a look.

Fig 2. Tweet Length Distribution

The plot above shows the distribution of tweet lengths in the data. You can see that the lengths range from 1 to 47, with 16 being the most common. This distribution piqued our interest and led to a hypothesis: different tweet lengths may have different topics or focus of discussion. Some topics might not need a long sentence to explain, while others might be better understood when explained at length.

Therefore, we separate the data into 4 different length groups:

  1. Length of 1 to 10
  2. Length of 11 to 14
  3. Length of 15 to 17
  4. Length of 18 and above

and we looked at which words occur the most in each length group.
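The grouping above can be sketched roughly as follows. We assume here that “length” means the number of whitespace-separated words in the cleaned tweet; the function names and the sample tweets are made up for illustration.

```python
from collections import defaultdict


def length_group(n_words: int) -> str:
    """Map a tweet length (in words, an assumption) to one of the four groups."""
    if n_words <= 10:
        return "1-10"
    if n_words <= 14:
        return "11-14"
    if n_words <= 17:
        return "15-17"
    return "18+"


def group_by_length(tweets):
    """Collect tweets into the four length groups, skipping empty tweets."""
    groups = defaultdict(list)
    for tweet in tweets:
        n_words = len(tweet.split())
        if n_words == 0:
            continue
        groups[length_group(n_words)].append(tweet)
    return dict(groups)


# Made-up sample: one tweet per group
sample = [
    "mobil listrik mahal",
    " ".join(["kata"] * 12),
    " ".join(["kata"] * 16),
    " ".join(["kata"] * 20),
]
grouped = group_by_length(sample)
print({name: len(items) for name, items in grouped.items()})
```

Each group’s list of tweets can then be fed to a word cloud or a simple word counter.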

Fig 3. Wordcloud for Tweet Length Distribution

If you take a look at the word clouds for the 1 to 10, 11 to 14, and 18 and above length groups, the focus of discussion (gauged from the word sizes) is not that different. The most frequent words revolve around “kendara”, “mobil”, “listrik”, “moeldoko”, etc. But the word cloud for the 15 to 17 length group is notably distinct from the others. Its most frequent words are “listrik”, “skuter”, “sepeda” and “larang”. This shows that for this particular group, the main focus of discussion is the prohibition of electric vehicles, and that it was discussed in relatively long sentences, implying that this specific topic might need to be explained in a holistic manner.

Trending Keyword

Using our dataset, we looked for keywords that are often mentioned when people talk about electric vehicles. From the word cloud below, we found that, among the keywords we specified before, the most frequently mentioned vehicle words (other than the word “listrik”) are, in order: mobil, kendaraan, sepeda, followed by motor.

Fig 4. Wordcloud for all words on all tweets

Excluding these vehicle-type keywords, it turns out that the word people mention most in their tweets is “larang”.

Fig 5. Wordcloud for keywords other than electric vehicle types
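The counting behind these word clouds can be approximated with a plain frequency count. This is only a sketch: the exclusion set `VEHICLE_WORDS`, the helper `top_words`, and the three sample tweets are hypothetical and stand in for the real cleaned data.

```python
from collections import Counter

# Hypothetical exclusion set: the vehicle-type keywords and "listrik"
VEHICLE_WORDS = {"listrik", "mobil", "kendaraan", "kendara", "sepeda", "motor"}


def top_words(tweets, exclude=frozenset(), n=5):
    """Return the n most frequent words across tweets, skipping excluded words."""
    counts = Counter(
        word
        for tweet in tweets
        for word in tweet.split()
        if word not in exclude
    )
    return counts.most_common(n)


# Made-up cleaned tweets
sample = [
    "mobil listrik larang jalan raya",
    "sepeda listrik larang",
    "motor listrik murah",
]
print(top_words(sample, n=2))
print(top_words(sample, exclude=VEHICLE_WORDS, n=1))
```

A word cloud is essentially a visual version of this table, with the font size driven by the count.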

Events Related with the Tweets

For additional EDA, we checked whether the tweets behave differently at certain times. We counted the number of tweets for each date, and the results are as follows:

Fig 6. Tweet Date Distribution

From the plot above, you can clearly see that on 20th July 2022, the number of tweets about electric vehicles spiked compared to other dates. After a little bit of googling, we found out that there was an event called GIIAS where electric vehicles were showcased, which was held on 20th July 2022. The sudden increase in tweets presumably happened because people were following the hype around GIIAS and opening discussions about it.
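The per-date count described above can be sketched as follows. The timestamps are made up, and `tweets_per_day` is a hypothetical helper; in the real data the timestamps would come from the `Time` column of the DataFrame.

```python
from collections import Counter
from datetime import datetime


def tweets_per_day(timestamps):
    """Count tweets per calendar date, returned in chronological order."""
    counts = Counter(ts.date() for ts in timestamps)
    return dict(sorted(counts.items()))


# Made-up timestamps for illustration
sample = [
    datetime(2022, 7, 19, 9, 0),
    datetime(2022, 7, 20, 8, 30),
    datetime(2022, 7, 20, 13, 15),
    datetime(2022, 7, 20, 21, 5),
]
daily = tweets_per_day(sample)
peak_day = max(daily, key=daily.get)  # date with the most tweets
print(daily)
print(peak_day)
```

Plotting `daily` as a bar chart gives the kind of date distribution shown in the figure.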

Sentiment Analysis

In this article, we also want to find out the sentiment of users’ tweets about electric vehicles. Using Cardiff Sentiment Analysis, we found several pieces of information.
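For readers who want to try something similar, here is a hedged sketch. The article does not name the exact Cardiff model, so the checkpoint below (`cardiffnlp/twitter-xlm-roberta-base-sentiment`, a multilingual model on the Hugging Face Hub) is our assumption, as are the helper names. The label strings returned depend on the checkpoint, so the tally helper makes no assumption about them.

```python
from collections import Counter

# Assumed checkpoint; the article only says "Cardiff Sentiment Analysis".
MODEL_NAME = "cardiffnlp/twitter-xlm-roberta-base-sentiment"


def classify_tweets(texts, model_name=MODEL_NAME):
    """Return one sentiment label per tweet.

    Requires the `transformers` package and downloads the model on first use,
    so it is not called in this sketch.
    """
    from transformers import pipeline  # imported lazily on purpose

    classifier = pipeline("sentiment-analysis", model=model_name)
    return [result["label"] for result in classifier(list(texts))]


def sentiment_share(labels):
    """Return the share of each sentiment label (empty dict for no labels)."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


# Hand-written labels standing in for classify_tweets(...) output
labels = ["positive", "negative", "neutral", "negative"]
print(sentiment_share(labels))
```

In practice, `classify_tweets(df["Tweet"])` would produce the labels, and `sentiment_share` would summarize them.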

Most Discussed Topic

In this part we use Latent Dirichlet Allocation (LDA) to help define topics in our data. With this model we get several clusters, each with its own frequently mentioned top keywords that distinguish it from the others. When inspecting the LDA results, there is an adjustable parameter called lambda (𝛌) that defines term relevance. This parameter ranges from 0 to 1. If 𝛌 is close to 1, terms are ranked mainly by how frequent they are within the selected topic. If 𝛌 is close to 0, terms are ranked mainly by the ratio of their frequency within the selected topic to their overall frequency, which surfaces terms that are more specific to that topic. The original paper authors kept lambda in the range of 0.3 to 0.6 (Sievert and Shirley, 2014), so in this analysis we chose 0.4.
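To make the effect of 𝛌 concrete, here is a small sketch of the relevance score defined by Sievert and Shirley (2014): relevance = 𝛌 · log(P(term | topic)) + (1 − 𝛌) · log(P(term | topic) / P(term)). The probabilities below are made up for illustration and are not taken from our model.

```python
import math


def relevance(p_term_given_topic, p_term, lam):
    """Relevance of a term to a topic for a given lambda (0 <= lam <= 1)."""
    lift = p_term_given_topic / p_term
    return lam * math.log(p_term_given_topic) + (1 - lam) * math.log(lift)


def rank_terms(terms, lam):
    """Sort terms from most to least relevant for the given lambda."""
    return sorted(terms, key=lambda t: relevance(*terms[t], lam), reverse=True)


# Made-up numbers: term -> (P(term | topic), P(term) in the whole corpus)
terms = {
    "listrik": (0.20, 0.18),  # frequent everywhere
    "larang": (0.08, 0.02),   # fairly frequent and fairly specific
    "skuter": (0.03, 0.005),  # rare but very specific to the topic
}

print(rank_terms(terms, 1.0))  # ranks by frequency within the topic
print(rank_terms(terms, 0.0))  # ranks by specificity (lift)
print(rank_terms(terms, 0.4))  # a balance of both
```

With these numbers, 𝛌 = 1 puts the globally common word “listrik” first, 𝛌 = 0 puts the rare but topic-specific “skuter” first, and 𝛌 = 0.4 favors “larang”, which is both reasonably frequent and reasonably specific.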

Using LDA Topic Modeling, we divide the Tweets into 3 clusters. We also tried to match them to the original tweets to get a better understanding of what each cluster is referring to.

For a sharper image, please refer to the GitHub repository attached at the end of the article

Most Discussed Topic in Positive & Negative Tweet

We also tried to find out which topics are discussed in tweets with positive and negative sentiment separately. Our hypothesis is that tweets with positive sentiment discuss different topics from tweets with negative sentiment. We clustered these tweets into 3 groups for each sentiment.

Left: topics with positive sentiment. Right: topics with negative sentiment. *For a sharper image, please refer to the GitHub repository attached at the end of the article

Based on the visualization above, some of the most discussed topics in positive and negative tweets are quite similar. This can indicate that each of these topics has its pros and cons on Twitter. The shared topics are broadly about:

  • Tweets protesting the restrictions on electric vehicles
  • Tweets about the assumption that electric cars are expensive, and an invitation to banks to break that assumption.

There is another topic that is mostly discussed with negative sentiment: tweets about the fuel subsidy, which is seen as a money-burning mechanism.

And there is a topic mostly discussed with positive sentiment: tweets welcoming banking commitments to support electric cars in Indonesia.

Executive Summary & Recommendation

After perusing and analyzing the copious tweets, we found some insights regarding electric vehicles in Indonesia:

  1. The most popular type of electric vehicle, gauged by the number of mentions, is the electric car, followed by electric bikes and electric motorcycles.
  2. Other than the electric vehicle discussion itself, the one thing that was also heavily discussed is the prohibition of electric vehicles. This is further supported by the fact that the most mentioned keyword, excluding the vehicle-related ones, is “larang” (forbid/ban).
  3. Nearing the GIIAS event, the wave of tweets increased notably, corresponding to the hype.
  4. Naturally, there are neutral, positive and negative sentiments toward electric vehicles in Indonesia. In general, the discussions can be segmented as follows:
  • Fuel subsidy and public opinion on preparing electric car infrastructure
  • PLN’s support for the acceleration of the electric vehicle ecosystem
  • Protests and complaints about the prohibition of electric vehicles and the expensive prerequisites for owning an electric vehicle.

In summary, if we were to connect the dots, electric vehicles have a pretty promising future in Indonesia. Society has fairly strong awareness of them: people are sharing ideas and concerns to make sure that Indonesia is well prepared for electric vehicles, and the hype around GIIAS shows that society is interested in electric vehicles. In addition, the national electricity company (PLN) is giving its full support to expedite Indonesia’s ecosystem for electric vehicles.

In contrast, it is also undeniably true that there are people who are deeply concerned about the prerequisites for owning an electric vehicle. The maintenance, the fuel, and even the price of the vehicle itself are far from cheap. On top of that, electric vehicles are prohibited in several regions, so society can’t help but ponder: “Is Indonesia truly ready for this transportation advancement, the electric vehicle?”

Therefore, from our analysis, we have a few recommendations for our readers (hopefully this reaches the officials!) for further improvement:

  1. Conduct a deep-dive analysis of Indonesia’s financial circumstances, from both the government’s and society’s perspective. Will our infrastructure be robust enough to cater to the needs of electric vehicles? Will electric vehicles be affordable enough for society? This needs to be analyzed rigorously to make sure that Indonesia is prepared for electric vehicles.
  2. The government should engage with society on a more personal level to receive ideas, input and concerns that might help the process of preparing Indonesia for electric vehicles. There are several ways to do this, such as sharing a survey, setting up a communal website or digital platform to gather public opinion, or even using social media data such as Twitter data, similar to what we did in this analysis.

It is our fondest hope that Indonesia will improve continuously, especially in transportation. Electric vehicles would most definitely help us improve Indonesia’s mobility while minimizing the environmental cost, such as CO2 and other exhaust gases. We believe that Indonesia’s finest days are not behind us; in fact, they are just around the corner, but we have to be ready to grasp the opportunity when it shows itself in the foreseeable future. We pray this analysis will help whoever reads it as a trigger or as initial insight for further research.

That pretty much sums up our analysis on this matter. Thank you for reading!
