Data Visualization Course

Data Visualization Protocol

The visualization protocol requires specifying the dataset’s timeframe and source, along with links to the dataset and metadata describing its columns. Additionally, it mandates documenting the steps and scripts used to process the raw data. This ensures transparency and traceability of the visualization process.

Data Source

Crimes in Boston
Crime incident reports are provided by the Boston Police Department (BPD) to document the initial details surrounding an incident to which BPD officers respond. This dataset contains records from the new crime incident report system, which includes a reduced set of fields focused on capturing the type of incident as well as when and where it occurred.

Timeframe:

From 2015 to 2018

link to the dataset

2. Observation of the dataset:

  1. After reviewing the dataset, we decided to drop the Offence_code column, because the meaning of each offence code is already available to us.
  2. We also noticed that the column names were not formatted as the program requires, so we renamed all columns to make them consistent and readable.
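
The two observations above can be sketched in a few lines of Python. This is only an illustration, not the script actually used: the raw header names (including the spelling OFFENSE_CODE), the sample rows, and the naming rule in tidy_name are all assumptions.

```python
import csv
import io

# Hypothetical raw sample; the real header names are an assumption.
RAW_CSV = """INCIDENT_NUMBER,OFFENSE_CODE,OFFENSE_DESCRIPTION,DISTRICT,HOUR
I001,3115,INVESTIGATE PERSON,B2,13
I002,619,LARCENY ALL OTHERS,D4,20
"""

DROPPED = {"OFFENSE_CODE"}  # assumed raw spelling of the Offence_code column


def tidy_name(name: str) -> str:
    """Assumed naming rule: trim whitespace, capitalise only the first letter."""
    return name.strip().capitalize()


def drop_and_rename(rows):
    """Return new rows without the dropped columns and with tidied names."""
    return [
        {tidy_name(key): value for key, value in row.items() if key not in DROPPED}
        for row in rows
    ]


rows = drop_and_rename(csv.DictReader(io.StringIO(RAW_CSV)))
print(list(rows[0]))
```

Applying one naming rule to every header, rather than renaming columns one by one, keeps the result consistent if the source file gains or loses a column.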

3. Data Cleaning:

  1. Removing duplicates: we checked for duplicate rows but found none to drop.
  2. Handling missing values: we deleted the rows with a missing value in the District feature.
  3. Replacing values: in the Shooting feature, we replaced “nan” with FALSE and “Y” with TRUE.
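
A minimal sketch of these three cleaning steps, using only the Python standard library. The sample rows, the column layout, and the set of strings treated as missing are assumptions made for illustration; the original cleaning script is not reproduced here.

```python
import csv
import io

# Hypothetical sample in the renamed layout. The duplicated I001 row is here
# only to exercise the duplicate check; the real data contained none.
SAMPLE_CSV = """Incident_number,District,Shooting,Hour
I001,B2,,13
I001,B2,,13
I002,,Y,20
I003,D4,Y,2
I004,C11,nan,9
"""

MISSING = {"", "nan"}  # assumed spellings of a missing value in the CSV


def clean(rows):
    """Drop exact duplicates and rows without a District; make Shooting boolean."""
    seen = set()
    cleaned = []
    for row in rows:
        key = tuple(row.items())
        if key in seen:
            continue  # exact duplicate of an earlier row
        seen.add(key)
        if row["District"].strip().lower() in MISSING:
            continue  # no district recorded
        # Assumes "Y" is the only marker of a shooting; anything else is False.
        cleaned.append({**row, "Shooting": row["Shooting"].strip() == "Y"})
    return cleaned


cleaned = clean(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for row in cleaned:
    print(row["Incident_number"], row["District"], row["Shooting"])
```

Rows with a missing District are dropped rather than imputed, which matches the step described above but reduces the number of incidents available to every later chart.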

5. Filtered Dataset:

The filtered dataset is exported as crime_cleaned.csv.

Preprocessing for each visualization

Each visualization required specific data treatments and preprocessing steps. To streamline the process, we decided to use the same dataset as a base, applying column filters, feature engineering, and other preprocessing directly within Tableau. This approach ensured consistency while allowing flexibility to tailor the data for each specific visualization.

Visualisation n.1

Line chart: average number of crimes by hour of the day

1. Create a calculated field that computes the average number of crimes in a day
2. Put the calculated field on the Rows shelf
3. Put Hour on the Columns shelf
4. Change the chart type to a line chart
5. Go to the Hour axis
6. Select Edit Axis
7. Under Range, select Custom
8. Change the start value from -2 to 0 and the end value from 25 to 23
9. In Title, select Hide title
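
The formula of the calculated field is not given above. As one plausible reading, the sketch below divides the number of crimes recorded at each hour by the number of distinct days in the data. The (date, hour) sample pairs and the existence of a date field are assumptions.

```python
from collections import Counter

# Hypothetical (date, hour) pairs, one per incident.
incidents = [
    ("2017-01-01", 0), ("2017-01-01", 0), ("2017-01-01", 13),
    ("2017-01-02", 0), ("2017-01-02", 13), ("2017-01-02", 13),
    ("2017-01-02", 13),
]


def average_crimes_per_hour(incidents):
    """Average daily crime count for each hour from 0 to 23."""
    days = {date for date, _ in incidents}
    if not days:
        return {hour: 0.0 for hour in range(24)}
    per_hour = Counter(hour for _, hour in incidents)
    # Hours with no incidents count as 0 so the line covers the whole axis.
    return {hour: per_hour[hour] / len(days) for hour in range(24)}


averages = average_crimes_per_hour(incidents)
print(averages[0], averages[13])
```

Returning a value for every hour from 0 to 23 mirrors the axis range set in step 8.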

Visualisation n.2

Map: types of crime by district

1. Drag the Geometry field of the Police_District.geojson file onto the empty view
2. Put District1 on Label in the Marks card
3. Put Number of crimes from the crime_cleaned.csv file on Color in the Marks card
4. Put Ucr Part in Filters
5. Select Edit Filter
6. Select one of the three main options (High, Medium, Low)
7. On Color in the Marks card, select Edit Colors
8. Select the color that matches the chosen level: red (High), orange (Medium), yellow (Low)
9. In Title, select Hide title
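
The aggregation behind this map can be checked outside Tableau. The sketch below assumes that Number of crimes is a plain row count and uses hypothetical (district, level) pairs; it filters on one Ucr Part level and counts incidents per district.

```python
from collections import Counter

# Hypothetical (district, Ucr Part level) pairs, one per incident.
records = [
    ("B2", "High"), ("B2", "High"), ("B2", "Low"),
    ("D4", "High"), ("D4", "Medium"), ("C11", "Low"),
]


def crimes_per_district(records, level):
    """Count incidents per district, keeping only the given Ucr Part level."""
    return Counter(district for district, part in records if part == level)


high = crimes_per_district(records, "High")
print(dict(high))
```

Districts with no incident at the selected level are absent from the result, so they would have no count to color on the map.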

Visualisation n.3

Bar chart: crime level by district


1. Put District from the crime_cleaned.csv file on the Columns shelf
2. Put the calculated field Number of crimes on the Rows shelf
3. Put Ucr Part in Filters
4. Select Edit Filter
5. Select only the High, Medium, and Low values
6. Drag Ucr Part onto Color in the Marks card
7. On Color in the Marks card, select Edit Colors
8. Select red (High), orange (Medium), yellow (Low)
9. In Title, select Hide title
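
The data behind this bar chart is a table of counts per district and level. A minimal sketch follows, with hypothetical sample records and an "Other" value standing in for whatever the filter in step 5 excludes; the color mapping is taken from step 8.

```python
from collections import Counter

# Color assigned to each level in step 8; insertion order sets the column order.
LEVEL_COLORS = {"High": "red", "Medium": "orange", "Low": "yellow"}

# Hypothetical (district, Ucr Part level) pairs, one per incident.
records = [
    ("B2", "High"), ("B2", "High"), ("B2", "Low"),
    ("D4", "Medium"), ("D4", "Other"), ("C11", "Low"),
]


def crimes_by_district_and_level(records):
    """Count incidents per district and level, ignoring unlisted levels."""
    counts = Counter(
        (district, level) for district, level in records if level in LEVEL_COLORS
    )
    districts = sorted({district for district, _ in counts})
    return {
        district: {level: counts[(district, level)] for level in LEVEL_COLORS}
        for district in districts
    }


table = crimes_by_district_and_level(records)
for district, levels in table.items():
    print(district, levels)
```

Each inner dictionary holds the three bar segments for one district, with zeros filled in so every district has the same set of levels.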

link to the documentation