The Zapping Insights: Indonesia’s Electric Vehicle

Traffic jam on Jl. Sudirman, Jakarta, on March 16, 2020. (Antara/Galih Pradipta)


  1. We use the Twitter API to collect the data from Twitter. This API can retrieve tweets from up to a week back.
  2. The data used in this analysis cover July 15th until July 22nd, 2022. Over this period we collected 3,983 unique tweets for further processing.
  3. Since we want to know the sentiment of Indonesian society toward electric vehicles, we extract only tweets that come from Indonesian users.
  4. We extract tweets that contain any of the following keywords:
  • Mobil Listrik
  • Motor Listrik
  • Kendaraan Listrik
  • Sepeda Listrik
  • Electric Vehicle
  5. For each tweet, we extract the following fields:
  • Twitter username
  • Tweet text
  • Timestamp of the tweet
  • Verified-account label (True if the account is an official, verified account)


  1. A certain type of electric vehicle is more popular than the others among Indonesian users.
  2. A specific topic was trending among Twitter users from July 15th to July 22nd, 2022.

Data Extraction

The data were extracted directly from Twitter over a fixed period. To extract data from Twitter, we need to create a Twitter account and register as a Twitter developer on the developer website. This step is essential to obtain the API key and token. For more on this, you can follow the step-by-step guide written by Ahmed Besbes. After setting up the account, API key and token, we can extract the data using the following script:

import pandas as pd
import tweepy

# API credentials (the values below are placeholders)
api_key = 'your_api_key'
api_key_secret = 'your_api_key_secret'
access_token = 'your_access_token'
access_token_secret = 'your_access_token_secret'

# Authenticate with the Twitter API
auth = tweepy.OAuthHandler(api_key, api_key_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)

# Keyword or hashtag to search for
keyword = 'kendaraan listrik'
# Extraction window (can be up to a week)
start_date = '2022-07-15'
end_date = '2022-07-22'
# Maximum number of tweets to fetch (example value; the original script does not show it)
limit = 5000

# api.search_tweets is the Tweepy 4.x name; in Tweepy 3.x the method is api.search
search_tweets = tweepy.Cursor(api.search_tweets, q=keyword, lang="id", since=start_date, until=end_date, count=100, tweet_mode='extended').items(limit)

# Collect the fields of interest into a DataFrame
data = []
columns = ['Time', 'User', 'Tweet', 'is_verified_acc']
for tweet in search_tweets:
    data.append([tweet.created_at, tweet.user.screen_name, tweet.full_text, tweet.user.verified])
df = pd.DataFrame(data, columns=columns)
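
The script above searches for a single keyword. One possible way to cover all five keywords in one request is to join them with the search API's OR operator, quoting each phrase so it is matched exactly. The sketch below is an assumption about how this could be done, not the script used for this analysis; build_query is a hypothetical helper name.

```python
# Hypothetical helper: combine several phrases into one Twitter search query.
KEYWORDS = [
    "mobil listrik",
    "motor listrik",
    "kendaraan listrik",
    "sepeda listrik",
    "electric vehicle",
]


def build_query(keywords):
    """Quote each phrase for exact matching and join the phrases with OR."""
    return " OR ".join(f'"{kw}"' for kw in keywords)


query = build_query(KEYWORDS)
print(query)
```

The resulting string could then be passed as q=query in place of q=keyword.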

Data Preprocessing

Data Infos

We collected tweets posted from July 15th, 2022 until July 22nd, 2022, specifically in Indonesia, that contain the keywords mentioned before. There are a total of 3,983 rows with 4 columns: the tweet timestamp, the username, the tweet text, and an indicator of whether the user is a verified account.

Fig 1. Example of the tweets

Pre-processing Steps

Tweets are essentially text, and text needs careful handling to make sure the data used for the analysis is tidy and easy to understand. Since there is a limitless number of ways people on Twitter can write their tweets (e.g. using emoji, certain punctuation marks, combinations of words, etc.), we pre-processed the data with the steps listed below:
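
As a rough illustration of this kind of cleaning, the sketch below lowercases a tweet and strips URLs, mentions, the hash sign, emoji and punctuation. These particular steps, and the clean_tweet name, are assumptions chosen for the example; they are not necessarily the exact steps used in this analysis.

```python
import re


def clean_tweet(text):
    """Return a lowercased tweet with URLs, mentions, emoji and punctuation removed."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"@\w+", " ", text)          # drop @mentions
    text = text.replace("#", " ")              # keep hashtag words, drop the '#'
    text = re.sub(r"[^\w\s]", " ", text)       # drop punctuation and emoji
    text = re.sub(r"\s+", " ", text)           # collapse repeated whitespace
    return text.strip()


# Made-up example tweet (\U0001F697 is a car emoji)
print(clean_tweet("Mobil listrik makin ramai!! \U0001F697 @some_user https://t.co/abc #EV"))
```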


Tweet Length Distribution

The way Twitter users discuss electric vehicles may vary, and one simple way to see this is through the length of the tweet. A short tweet might suggest that the topic is being discussed in a conversation (replies and mentions) or as a brief statement. A moderate to long tweet might suggest that the topic is being discussed in a thread or in a longer statement. Let’s take a look.

Fig 2. Tweet Length Distribution
  1. Length of 1 to 10
  2. Length of 11 to 14
  3. Length of 15 to 17
  4. Length of 18 and above
Fig 3. Wordcloud for Tweet Length Distribution
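
The four length groups listed above can be reproduced with a small helper. This is a sketch under the assumption that length is counted in words; the bucket boundaries come from the list above, while length_bucket and the sample tweets are made up for illustration.

```python
from collections import Counter


def length_bucket(tweet):
    """Assign a tweet to one of the four length groups, counting words."""
    n_words = len(tweet.split())
    if n_words <= 10:
        return "1 to 10"
    if n_words <= 14:
        return "11 to 14"
    if n_words <= 17:
        return "15 to 17"
    return "18 and above"


# Made-up sample: one short tweet, one of 12 words, one of 20 words
tweets = ["mobil listrik mahal", " ".join(["kata"] * 12), " ".join(["kata"] * 20)]
print(Counter(length_bucket(t) for t in tweets))
```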

Trending Keyword

Using our dataset, we look for the keywords that are mentioned most often when people talk about electric vehicles. From the word cloud below, we found that, among the keywords specified earlier, the vehicle types mentioned most often (other than the word listrik) are, in order: mobil, kendaraan, sepeda, and motor.

Fig 4. Wordcloud for all words on all tweets
Fig 5. Wordcloud for non electrical vehicle type keywords
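
The ranking behind a word cloud is a plain word count. A minimal sketch, assuming the tweets have already been cleaned into lowercase text (the sample tweets are made up for illustration):

```python
from collections import Counter

# Made-up sample tweets standing in for the cleaned 'Tweet' column
tweets = [
    "mobil listrik makin banyak",
    "sepeda listrik dilarang di jalan raya",
    "harga mobil listrik masih mahal",
]

# Count every word across all tweets
word_counts = Counter(word for tweet in tweets for word in tweet.split())

# Most frequent words overall
print(word_counts.most_common(3))

# Counts for the vehicle-type keywords only
vehicle_words = ["mobil", "kendaraan", "sepeda", "motor"]
print({w: word_counts[w] for w in vehicle_words})
```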

Events Related with the Tweets

For additional EDA, we checked whether the tweets behave differently at certain times. We counted the number of tweets for each date, with the following results:

Fig 6. Tweet Date Distribution
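
Counting tweets per date can be sketched as follows. The timestamps are made up and stand in for the 'Time' column of the extracted data.

```python
from collections import Counter
from datetime import datetime

# Made-up timestamps standing in for the 'Time' column
timestamps = [
    datetime(2022, 7, 15, 9, 30),
    datetime(2022, 7, 15, 18, 5),
    datetime(2022, 7, 16, 7, 45),
]

# Drop the time of day and count tweets per calendar date
tweets_per_day = Counter(ts.date() for ts in timestamps)
for day in sorted(tweets_per_day):
    print(day, tweets_per_day[day])
```

With the DataFrame built by the extraction script, df['Time'].dt.date.value_counts().sort_index() should give the same kind of table.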

Sentiment Analysis

In this article, we also want to find out the sentiment of users’ tweets about electric vehicles. Using Cardiff Sentiment Analysis, we found several pieces of information.
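
For readers who want to try something similar, the sketch below scores tweets with a Cardiff NLP model through the Hugging Face transformers pipeline. The exact model used in this analysis is not stated; cardiffnlp/twitter-xlm-roberta-base-sentiment (a multilingual model) is an assumption, and classify_tweets and sentiment_share are hypothetical helper names.

```python
from collections import Counter

# Assumed model; the article only says "Cardiff Sentiment Analysis"
MODEL_NAME = "cardiffnlp/twitter-xlm-roberta-base-sentiment"


def classify_tweets(tweets):
    """Return one lowercase sentiment label per tweet (downloads the model on first use)."""
    from transformers import pipeline  # imported lazily: heavy dependency

    classifier = pipeline("sentiment-analysis", model=MODEL_NAME, tokenizer=MODEL_NAME)
    return [result["label"].lower() for result in classifier(list(tweets))]


def sentiment_share(labels):
    """Return the fraction of tweets carrying each sentiment label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Example with hand-written labels, so no model download is needed here
print(sentiment_share(["neutral", "positive", "neutral", "negative"]))
```

In practice, the labels would come from classify_tweets applied to the tweet texts rather than being typed by hand.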

Most Discussed Topic

In this part we use Latent Dirichlet Allocation (LDA) to identify topics in our data. The model produces several clusters, each with its own set of frequently mentioned top keywords that distinguish it from the others. When inspecting the topics, there is an adjustable parameter called lambda (𝛌), ranging from 0 to 1, that defines how term relevance is computed. With 𝛌 close to 1, terms are ranked mainly by how frequent they are within the selected topic, so words that are common everywhere tend to rise to the top. With 𝛌 close to 0, terms are ranked mainly by the ratio of their frequency within the selected topic to their overall frequency, which favors terms that are specific to that topic. The authors of the original paper kept lambda in the range of 0.3 to 0.6 (Sievert and Shirley, 2014), so in this analysis we use 0.4.
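
To make the role of lambda concrete, here is a small sketch of the relevance score defined by Sievert and Shirley (2014): relevance = lambda * log p(w|t) + (1 - lambda) * log(p(w|t) / p(w)), where p(w|t) is the probability of a term within a topic and p(w) is its overall probability. The probabilities below are invented for illustration.

```python
import math


def relevance(p_word_given_topic, p_word, lam):
    """Relevance of a term to a topic for a given lambda between 0 and 1."""
    lift = p_word_given_topic / p_word
    return lam * math.log(p_word_given_topic) + (1 - lam) * math.log(lift)


# Invented probabilities: term -> (p(term | topic), p(term) in the whole corpus)
terms = {
    "listrik": (0.10, 0.09),  # frequent everywhere
    "larang": (0.04, 0.01),   # rarer, but concentrated in this topic
}

# Rank the terms under three different values of lambda
for lam in (1.0, 0.4, 0.0):
    ranked = sorted(terms, key=lambda t: relevance(*terms[t], lam), reverse=True)
    print(lam, ranked)
```

In this toy example the globally frequent term ranks first at lambda = 1, while the topic-specific term ranks first at lambda = 0.4 and lambda = 0.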

For a sharper image, please refer to the GitHub repository linked at the end of the article.

Most Discussed Topic in Positive & Negative Tweet

We also try to find out which topics are discussed in tweets with positive and negative sentiment separately. Our hypothesis is that tweets with positive sentiment and tweets with negative sentiment cover different topics. We cluster the tweets into 3 groups for each sentiment.

Left: topics with positive sentiment. Right: topics with negative sentiment. *For a sharper image, please refer to the GitHub repository linked at the end of the article.
  • Tweets protesting the restrictions on electric vehicles
  • Tweets about the assumption that electric cars are expensive, and an invitation to banks to break that assumption

Executive Summary & Recommendation

After reading through and analyzing the tweets, we found several insights regarding electric vehicles in Indonesia:

  1. The most popular type of electric vehicle, gauged by the number of mentions, is the electric car, followed by electric bikes and electric motorcycles.
  2. Beyond the electric vehicle discussion itself, the most heavily discussed subject was the prohibition of electric vehicles: the most mentioned keyword, excluding the electric vehicle keywords themselves, is “larang” (forbid/ban).
  3. Nearing the GIIAS event, the number of tweets increased notably, in line with the hype.
  4. Naturally, there are neutral, positive and negative sentiments toward electric vehicles in Indonesia. In general, the discussions can be segmented as follows:
  • Fuel subsidy and public opinion to prepare electric car infrastructure
  • PLN’s support for the acceleration of the electric vehicle ecosystem
  • Protest and complaints about the prohibition on electric vehicles and the expensive prerequisite for owning an electric vehicle.

Based on these insights, we recommend the following:

  1. Conduct a deep-dive analysis of Indonesia’s financial circumstances, from both the government’s and society’s perspective. Will our infrastructure be robust enough to cater to the needs of electric vehicles? Will electric vehicles be affordable enough for society? This needs to be analyzed rigorously to make sure that Indonesia is prepared for electric vehicles.
  2. The government should engage with society on a more personal level to gather ideas, input and concerns that might help the preparation for electric vehicles in Indonesia. There are several ways to do this, such as sharing a survey, setting up a communal website or digital platform to gather public opinion, or even using social media data such as Twitter data, similar to what we do in this analysis.


