Week 6

Here are graphs of the terrorist attacks in Iraq and the USA from 1981 to 2017. One interesting observation is that some attacks are not physically located in the country the dataset lists for them. For example, a couple of attacks are plotted in China, but the dataset states that they were directed towards the US. I have not yet decided whether to include these data points in the analysis.
Again, terrorist attacks seem to have been less common in the 1900s than in the 2000s, so I need to find background information on why. Another observation is that a significant number of the terrorist attacks in Italy occurred in the 1900s. The opposite is true for Yemen and neighboring countries. These countries are also possible options for analysis.
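
The location mismatch described above could be checked automatically rather than by eye. Below is a minimal sketch, assuming each record carries a listed country plus latitude and longitude. The bounding box is a rough approximation chosen for illustration (it ignores Alaska, Hawaii, and territories), and the rows are placeholders, not real GTD records.

```python
# Rough bounding box for the contiguous United States. This is an
# approximation for illustration; it ignores Alaska, Hawaii, and territories.
US_BOX = {"lat_min": 24.0, "lat_max": 50.0, "lon_min": -125.0, "lon_max": -66.0}

# Placeholder rows, not real GTD records: (listed country, latitude, longitude).
rows = [
    ("United States", 40.7, -74.0),
    ("United States", 39.9, 116.4),
    ("United States", None, None),
]


def inside_box(lat, lon, box):
    """Return True if the point falls within the bounding box."""
    return (box["lat_min"] <= lat <= box["lat_max"]
            and box["lon_min"] <= lon <= box["lon_max"])


def classify(row, box):
    """Label a row as 'inside', 'outside', or 'missing' coordinates."""
    _, lat, lon = row
    if lat is None or lon is None:
        return "missing"
    return "inside" if inside_box(lat, lon, box) else "outside"


labels = [classify(row, US_BOX) for row in rows]
print(labels)
```

Rows labeled "outside" are the ones to review by hand before deciding whether to keep them.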

Week 5

For this week, I wanted to identify the focus of my analysis of the GTD dataset. This will most likely change in the future, but it will serve as a good start.

USA

  • Determine if a relationship exists between population and number of attacks. Unsure how to statistically find this relationship yet (linear / logistic regression?).
    • Gather population of city for year of attack through Population data first
    • Map the cities and use marker(s) to denote population number and/or number of attacks
  • Determine if a relationship exists between population and severity of attacks. Unsure how to statistically find this relationship yet (linear / logistic regression?).
    • Gather population of city for year of attack through Population data first
    • Map the cities and use marker(s) to denote population number and/or severity of attack
    • Define what severity of an attack means (deaths, injuries, destruction, etc.)
  • Form clusters to see areas where attacks happen
    • Through mapping the cities, it seems there are patches of the USA where attacks haven’t occurred. Possibly research background info into why.
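
As a starting point for the population questions above, here is a minimal sketch of a Pearson correlation and a least-squares line, written with the standard library only. The numbers are placeholders, not real populations or attack counts. Note that logistic regression would fit a yes/no outcome (whether a city had any attack) rather than a count.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def least_squares(xs, ys):
    """Slope and intercept of the line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


# Placeholder values only: population (in arbitrary units) and attack counts.
population = [1, 2, 3, 4, 5]
attacks = [1, 3, 2, 5, 4]

r = pearson_r(population, attacks)
slope, intercept = least_squares(population, attacks)
print(f"r = {r:.2f}, slope = {slope:.2f}, intercept = {intercept:.2f}")
```

The same two functions would work for severity by swapping the attack counts for whichever severity measure is chosen.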

Iraq

  • Study the difference in number of attacks for the years 1970-2017
    • Research background info on why the number of attacks increased substantially in the 2000s
  • Study severity of attacks in the 1900s and the 2000s
    • Is there a difference? If so, research possibilities on why
  • Find where terrorist groups in the country are based
    • A correlation between terrorist group locations and number / severity of attacks is possible
  • Form clusters to see where attacks happen
    • Through mapping the attacks, there seems to be patches where attacks haven’t occurred. Possibly research background info into why.
  • If population info can be found, do the same population analysis as for the USA
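
One simple way to see where attacks concentrate, and where the empty patches are, is to bin coordinates into grid cells and count attacks per cell. The sketch below does only that. It is plain binning, not a clustering algorithm such as k-means or DBSCAN, and the coordinates and the one-degree cell size are placeholder assumptions.

```python
import math
from collections import Counter


def grid_cell(lat, lon, cell_size_deg=1.0):
    """Map a coordinate to the (row, column) index of its grid cell."""
    return (math.floor(lat / cell_size_deg), math.floor(lon / cell_size_deg))


def count_by_cell(points, cell_size_deg=1.0):
    """Count how many points fall into each grid cell."""
    return Counter(grid_cell(lat, lon, cell_size_deg) for lat, lon in points)


# Placeholder (latitude, longitude) pairs, not real GTD records.
points = [(33.3, 44.4), (33.4, 44.3), (33.9, 44.9), (36.3, 43.1), (30.5, 47.8)]

counts = count_by_cell(points)
for cell, n in counts.most_common():
    print(cell, n)
```

Cells that never appear in the counts are the candidate "patches" without attacks. A smaller cell size gives finer detail at the cost of more empty cells.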

Week 4

The data I’m exploring is the Global Terrorism Database (GTD), which can be found here: https://www.kaggle.com/START-UMD/gtd. So far, I have used Tableau to map all terrorist attacks with the latitude and longitude information and color coded them by year. I also made a bar graph of the number of terrorist attacks by country and by year.

One interesting observation I’ve made so far is that although Iraq has by far the most terrorist attacks, the vast majority of them (I’m approximating 95% or more) occurred in or after 2003, even though the data covers terrorist attacks from 1970 to 2017. I’m thinking of exploring why, using background information, in the future.
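
The 95% approximation above could be replaced with an exact figure. Here is a minimal sketch, assuming each record reduces to a (country, year) pair. In the GTD file these would presumably come from columns named country_txt and iyear, which is an assumption worth checking against the actual file. The records below are placeholders.

```python
from collections import Counter

# Placeholder (country, year) pairs, not real GTD records.
records = [
    ("Iraq", 1995),
    ("Iraq", 2004),
    ("Iraq", 2007),
    ("Iraq", 2014),
    ("United States", 2001),
]


def share_on_or_after(records, country, cutoff_year):
    """Fraction of a country's attacks with year >= cutoff_year, or None."""
    years = [year for c, year in records if c == country]
    if not years:
        return None
    recent = sum(1 for year in years if year >= cutoff_year)
    return recent / len(years)


def attacks_per_year(records, country):
    """Number of attacks per year for one country."""
    return Counter(year for c, year in records if c == country)


share = share_on_or_after(records, "Iraq", 2003)
per_year = attacks_per_year(records, "Iraq")
print(f"Share in or after 2003: {share:.0%}")
```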

I began trying to see whether population influences the number or severity of terrorist attacks, but difficulties have arisen. Since the GTD doesn’t include population information, I used a different dataset to get the population of each US city from 2010 to 2019 (https://www.census.gov/data/datasets/time-series/demo/popest/2010s-total-cities-and-towns.html#tables, Incorporated Places: 2010 to 2019 United States Dataset). I made a revised version of the GTD dataset that only includes attacks in the US from 2010 to 2017. I have also revised both datasets so that they can form a “relationship” on the city information, but I’m unable to make a relationship on the year information. As a result, I can’t easily integrate the data to get the population of a city for the year in which a terrorist attack occurred.

I haven’t been able to find a simple solution to this problem other than coding. I haven’t coded this yet, since the code could be complicated and I’m unsure what language to use, but I do have an idea of how to get this information without manually inputting it. Below is pseudocode for what I think a solution could be:

create new column in Terrorist US Data called Population
for all rows in Terrorist US Data (i = 1:n)
    for all rows in Population US Data
        if city names from both datasets match
            capture row value in Population US Data where cities match (r)
            stop for loop
        else
            go to next iteration / row in Population US Data
    for certain columns (ones for year) in Population US Data
        if years from both datasets match
            capture column value in Population US Data where years match (c)
            stop for loop
        else
            go to next iteration / column in Population US Data
    Terrorist US Data [i, Population] = Population US Data [r,c]
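
The pseudocode above can be sketched in Python without nested search loops by first turning the population table into a dictionary keyed by (city, year). This is only one possible approach, under several assumptions: the tables and field names are hypothetical, the population values are placeholders, and the year columns are assumed to be named by the year alone (columns with a prefix would need renaming first).

```python
# Placeholder attack rows, not real GTD records.
attacks = [
    {"city": "Alpha", "year": 2013},
    {"city": "beta ", "year": 2016},
    {"city": "Gamma", "year": 2015},  # no matching city in the population table
]

# Wide format: one row per city, one column per year (placeholder values).
population_wide = [
    {"city": "Alpha", "2013": 1000, "2016": 1100},
    {"city": "Beta", "2013": 2000, "2016": 2200},
]


def normalize(name):
    """Lowercase and trim so that 'beta ' and 'Beta' compare equal."""
    return name.strip().lower()


def build_lookup(population_rows, city_key="city"):
    """Turn wide rows into a {(city, year): population} dictionary."""
    lookup = {}
    for row in population_rows:
        city = normalize(row[city_key])
        for column, value in row.items():
            # Only columns whose name is a year are treated as populations.
            if column != city_key and column.isdigit():
                lookup[(city, int(column))] = value
    return lookup


def add_population(attack_rows, lookup):
    """Return copies of the attack rows with a 'population' field added."""
    result = []
    for row in attack_rows:
        key = (normalize(row["city"]), int(row["year"]))
        # None marks attacks with no matching city and year.
        result.append({**row, "population": lookup.get(key)})
    return result


lookup = build_lookup(population_wide)
merged = add_population(attacks, lookup)
for row in merged:
    print(row)
```

Since city names repeat across states, matching on city and state together would likely be safer than matching on city alone. A library such as pandas offers the same reshape-and-join through its melt and merge functions, if a dataframe workflow is preferred.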

If you have any advice or suggestions, I would appreciate it if you let me know.

Thank you!

Week 3

This week, I will be receiving the dataset I’ll be working on for the rest of the semester. I’m not sure what topics the data will cover, but I’m looking forward to finding background information about it and experimenting with it!