{
|
||
"cells": [
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "83885e86-1ccb-46ec-bee9-a33f3b541569",
|
||
"metadata": {},
|
||
"source": [
|
"# Summary of the Hackathon Analyses for the Website\n",
|
||
"\n",
|
"- possibly for presentation on the website\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 14,
|
||
"id": "9bd1686f-9bbc-4c05-a5f5-e0c4ce653fb2",
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"import numpy as np\n",
|
||
"import pandas as pd\n",
|
||
"import altair as alt"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "81780c9a-7721-438b-9726-ff5a70910ce8",
|
||
"metadata": {},
|
||
"source": [
|
"## Data preparation\n",
|
||
"\n",
|
"Database dump from 25 March 2023. The individual tables of the database are read in separately. In addition, all information that belongs directly to a tweet is collected in a single data object. The GIS data for the individual police stations (\"police_stations\") is still incomplete and may need to be checked again.\n",
|
||
"\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 6,
|
||
"id": "e312a975-3921-44ee-a7c5-37736678bc3f",
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>tweet_id</th>\n",
|
||
" <th>measured_at</th>\n",
|
||
" <th>like_count</th>\n",
|
||
" <th>reply_count</th>\n",
|
||
" <th>retweet_count</th>\n",
|
||
" <th>quote_count</th>\n",
|
||
" <th>is_deleted</th>\n",
|
||
" <th>tweet_text</th>\n",
|
||
" <th>created_at</th>\n",
|
||
" <th>user_id</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>1496955054712045581</td>\n",
|
||
" <td>2022-02-28 22:42:26</td>\n",
|
||
" <td>13</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>2</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Auch wir schließen uns dem Apell an! \\n\\n#Ukra...</td>\n",
|
||
" <td>2022-02-24 21:07:51</td>\n",
|
||
" <td>773438463068766208</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>1496957213516214277</td>\n",
|
||
" <td>2022-02-28 22:42:26</td>\n",
|
||
" <td>2</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>@BWeltenbummler Sehr schwer zu sagen. Die Evak...</td>\n",
|
||
" <td>2022-02-24 21:16:26</td>\n",
|
||
" <td>2397974054</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>1496963501201539073</td>\n",
|
||
" <td>2022-02-28 22:42:26</td>\n",
|
||
" <td>-1</td>\n",
|
||
" <td>-1</td>\n",
|
||
" <td>-1</td>\n",
|
||
" <td>-1</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>Halten Sie durch – die Evakuierung ist fast ab...</td>\n",
|
||
" <td>2022-02-24 21:41:25</td>\n",
|
||
" <td>2398002414</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>1496963771054825472</td>\n",
|
||
" <td>2022-02-28 23:42:27</td>\n",
|
||
" <td>142</td>\n",
|
||
" <td>7</td>\n",
|
||
" <td>8</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>Halten Sie durch – die Evakuierung ist fast ab...</td>\n",
|
||
" <td>2022-02-24 21:42:29</td>\n",
|
||
" <td>2398002414</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>1496965696907104258</td>\n",
|
||
" <td>2022-02-28 23:42:27</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>11</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>RT @drkberlin_iuk: 🚨 In enger Abstimmung mit d...</td>\n",
|
||
" <td>2022-02-24 21:50:09</td>\n",
|
||
" <td>2398002414</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" tweet_id measured_at like_count reply_count \\\n",
|
||
"0 1496955054712045581 2022-02-28 22:42:26 13 0 \n",
|
||
"1 1496957213516214277 2022-02-28 22:42:26 2 0 \n",
|
||
"2 1496963501201539073 2022-02-28 22:42:26 -1 -1 \n",
|
||
"3 1496963771054825472 2022-02-28 23:42:27 142 7 \n",
|
||
"4 1496965696907104258 2022-02-28 23:42:27 0 0 \n",
|
||
"\n",
|
||
" retweet_count quote_count is_deleted \\\n",
|
||
"0 2 0 0 \n",
|
||
"1 0 0 0 \n",
|
||
"2 -1 -1 1 \n",
|
||
"3 8 3 0 \n",
|
||
"4 11 0 0 \n",
|
||
"\n",
|
||
" tweet_text created_at \\\n",
|
||
"0 Auch wir schließen uns dem Apell an! \\n\\n#Ukra... 2022-02-24 21:07:51 \n",
|
||
"1 @BWeltenbummler Sehr schwer zu sagen. Die Evak... 2022-02-24 21:16:26 \n",
|
||
"2 Halten Sie durch – die Evakuierung ist fast ab... 2022-02-24 21:41:25 \n",
|
||
"3 Halten Sie durch – die Evakuierung ist fast ab... 2022-02-24 21:42:29 \n",
|
||
"4 RT @drkberlin_iuk: 🚨 In enger Abstimmung mit d... 2022-02-24 21:50:09 \n",
|
||
"\n",
|
||
" user_id \n",
|
||
"0 773438463068766208 \n",
|
||
"1 2397974054 \n",
|
||
"2 2398002414 \n",
|
||
"3 2398002414 \n",
|
||
"4 2398002414 "
|
||
]
|
||
},
|
||
"execution_count": 6,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"tweets_meta = pd.read_csv(\"data/tweets.csv\")\n",
|
||
"tweets_time = pd.read_csv(\"data/tweets-1679742620302.csv\")\n",
|
||
"tweets_text = pd.read_csv(\"data/tweets-1679742698645.csv\")\n",
|
||
"tweets_user = pd.read_csv(\"data/tweets-1679742702794.csv\"\n",
|
" ).rename(columns = {\"username\":\"handle\", # swap the 'username' and 'handle' column names\n",
|
||
" \"handle\": \"username\"})\n",
|
||
"tweets_user = tweets_user.assign(handle = tweets_user['handle'].str.lower()) # convert handles to lower case\n",
|
"tweets_combined = pd.merge(tweets_time, # merge the two tweet-related data frames\n",
|
||
" tweets_text, \n",
|
||
" how = 'inner', \n",
|
||
" on = 'tweet_id'\n",
|
" ).drop(['id'], # drop unnecessary id column (redundant to index)\n",
|
||
" axis = 1)\n",
|
"tweets_combined = tweets_combined.assign(measured_at = pd.to_datetime(tweets_combined['measured_at']), # parse timestamp columns as datetime\n",
|
||
" created_at = pd.to_datetime(tweets_combined['created_at']))\n",
|
"police_stations = pd.read_csv(\"data/polizei_accounts_geo.csv\", sep = \"\\t\" # additional information on police stations\n",
|
||
" ).rename(columns = {\"Polizei Account\": \"handle\"})\n",
|
||
"tweets_combined.head()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "91dfb8bb-15dc-4b2c-9c5f-3eab18d78ef8",
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"source": [
|
"### Adjacency matrix of mentions\n",
|
||
" \n",
|
"Information that is not directly contained in the data: which accounts are mentioned. A mention is only marked in the tweet text as @handle."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 53,
|
||
"id": "5d8bf730-3c8f-4143-b405-c95f1914f54b",
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"0 Auch wir schließen uns dem Apell an! \\n\\n#Ukra...\n",
|
||
"1 @BWeltenbummler Sehr schwer zu sagen. Die Evak...\n",
|
||
"2 Halten Sie durch – die Evakuierung ist fast ab...\n",
|
||
"3 Halten Sie durch – die Evakuierung ist fast ab...\n",
|
||
"4 RT @drkberlin_iuk: 🚨 In enger Abstimmung mit d...\n",
|
||
"Name: tweet_text, dtype: object"
|
||
]
|
||
},
|
||
"execution_count": 53,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
"# TODO: extract the @handle mentions from tweet_text and build the adjacency matrix"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "0c242090-0748-488c-b604-f521030f468f",
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"source": [
|
"## Metadata\n",
|
||
"\n",
|
"Which data form the basis of the analysis?"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 7,
|
||
"id": "0e5eb455-6b12-4572-8f5e-f328a94bd797",
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"hashtag 157145\n",
|
||
"url 88322\n",
|
||
"mention 36815\n",
|
||
"Name: entity_type, dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 7,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"tweets_meta[\"entity_type\"].value_counts()\n",
|
||
"# tweets_meta[tweets_meta['entity_type'] == \"mention\"]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "ef440301-cf89-4e80-8801-eb853d636190",
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"source": [
|
"In total we have 84794 unique tweets:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 8,
|
||
"id": "5a438e7f-8735-40bb-b450-2ce168f0f67a",
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"84794"
|
||
]
|
||
},
|
||
"execution_count": 8,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
"tweets_combined[\"tweet_id\"].nunique() # number of unique tweets"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 9,
|
||
"id": "4f1e8c6c-3610-436e-899e-4d0307259230",
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
"The tweets were collected from 2022-02-24 to 2023-03-16, i.e. 384 days in total.\n"
]
}
],
"source": [
"first_created = tweets_combined['created_at'].min()\n",
"last_created = tweets_combined['created_at'].max()\n",
"print(f\"The tweets were collected from {first_created.date()} to {last_created.date()}, i.e. {(last_created - first_created).days} days in total.\")\n",
"# tweets_combined[tweets_combined['created_at'] == last_created] # tweets with the latest timestamp"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "d8b47a60-1535-4d03-913a-73e897bc18df",
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"source": [
|
"Which police accounts tweeted the most?"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 43,
|
||
"id": "9373552e-6baf-46df-ae16-c63603e20a83",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>handle</th>\n",
|
||
" <th>count</th>\n",
|
||
" <th>Name</th>\n",
|
||
" <th>Typ</th>\n",
|
||
" <th>Bundesland</th>\n",
|
||
" <th>Stadt</th>\n",
|
||
" <th>LAT</th>\n",
|
||
" <th>LONG</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>11</th>\n",
|
||
" <td>polizei_ffm</td>\n",
|
||
" <td>2993</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>polizei_nrw_do</td>\n",
|
||
" <td>2860</td>\n",
|
||
" <td>Polizei NRW DO</td>\n",
|
||
" <td>Polizei</td>\n",
|
||
" <td>Nordrhein-Westfalen</td>\n",
|
||
" <td>Dortmund</td>\n",
|
||
" <td>51.5142273</td>\n",
|
||
" <td>7.4652789</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>polizeisachsen</td>\n",
|
||
" <td>2700</td>\n",
|
||
" <td>Polizei Sachsen</td>\n",
|
||
" <td>Polizei</td>\n",
|
||
" <td>Sachsen</td>\n",
|
||
" <td>Dresden</td>\n",
|
||
" <td>51.0493286</td>\n",
|
||
" <td>13.7381437</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>91</th>\n",
|
||
" <td>polizeibb</td>\n",
|
||
" <td>2310</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>61</th>\n",
|
||
" <td>polizeihamburg</td>\n",
|
||
" <td>2093</td>\n",
|
||
" <td>Polizei Hamburg</td>\n",
|
||
" <td>Polizei</td>\n",
|
||
" <td>Hamburg</td>\n",
|
||
" <td>Hamburg</td>\n",
|
||
" <td>53.550341</td>\n",
|
||
" <td>10.000654</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" handle count Name Typ Bundesland \\\n",
|
||
"11 polizei_ffm 2993 NaN NaN NaN \n",
|
||
"3 polizei_nrw_do 2860 Polizei NRW DO Polizei Nordrhein-Westfalen \n",
|
||
"0 polizeisachsen 2700 Polizei Sachsen Polizei Sachsen \n",
|
||
"91 polizeibb 2310 NaN NaN NaN \n",
|
||
"61 polizeihamburg 2093 Polizei Hamburg Polizei Hamburg \n",
|
||
"\n",
|
||
" Stadt LAT LONG \n",
|
||
"11 NaN NaN NaN \n",
|
||
"3 Dortmund 51.5142273 7.4652789 \n",
|
||
"0 Dresden 51.0493286 13.7381437 \n",
|
||
"91 NaN NaN NaN \n",
|
||
"61 Hamburg 53.550341 10.000654 "
|
||
]
|
||
},
|
||
"execution_count": 43,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"tweets_agg = tweets_combined.merge(tweets_user,\n",
|
||
" on = \"user_id\"\n",
|
||
" ).groupby(by = [\"user_id\", \"handle\", \"username\"]\n",
|
||
" )[\"user_id\"].aggregate(['count']\n",
|
||
" ).merge(police_stations, \n",
|
||
" on = \"handle\",\n",
|
||
" how = \"left\"\n",
|
||
" ).sort_values(['count'], \n",
|
||
" ascending=False)\n",
|
||
"tweets_agg.shape\n",
|
||
"activy_police_vis = tweets_agg[0:50]\n",
|
"activy_police_vis.head()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "9cf5f544-706b-41af-b785-7023f04e3ecb",
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"source": [
|
"Visualization of the most active police stations:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 47,
|
||
"id": "b1c39196-d1cc-4f82-8e01-7529e7b3046f",
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"\n",
|
||
"<div id=\"altair-viz-a660bd38b72240eaae654b5e471932a6\"></div>\n",
|
||
"<script type=\"text/javascript\">\n",
|
||
" var VEGA_DEBUG = (typeof VEGA_DEBUG == \"undefined\") ? {} : VEGA_DEBUG;\n",
|
||
" (function(spec, embedOpt){\n",
|
||
" let outputDiv = document.currentScript.previousElementSibling;\n",
|
||
" if (outputDiv.id !== \"altair-viz-a660bd38b72240eaae654b5e471932a6\") {\n",
|
||
" outputDiv = document.getElementById(\"altair-viz-a660bd38b72240eaae654b5e471932a6\");\n",
|
||
" }\n",
|
||
" const paths = {\n",
|
||
" \"vega\": \"https://cdn.jsdelivr.net/npm//vega@5?noext\",\n",
|
||
" \"vega-lib\": \"https://cdn.jsdelivr.net/npm//vega-lib?noext\",\n",
|
||
" \"vega-lite\": \"https://cdn.jsdelivr.net/npm//vega-lite@4.17.0?noext\",\n",
|
||
" \"vega-embed\": \"https://cdn.jsdelivr.net/npm//vega-embed@6?noext\",\n",
|
||
" };\n",
|
||
"\n",
|
||
" function maybeLoadScript(lib, version) {\n",
|
||
" var key = `${lib.replace(\"-\", \"\")}_version`;\n",
|
||
" return (VEGA_DEBUG[key] == version) ?\n",
|
||
" Promise.resolve(paths[lib]) :\n",
|
||
" new Promise(function(resolve, reject) {\n",
|
||
" var s = document.createElement('script');\n",
|
||
" document.getElementsByTagName(\"head\")[0].appendChild(s);\n",
|
||
" s.async = true;\n",
|
||
" s.onload = () => {\n",
|
||
" VEGA_DEBUG[key] = version;\n",
|
||
" return resolve(paths[lib]);\n",
|
||
" };\n",
|
||
" s.onerror = () => reject(`Error loading script: ${paths[lib]}`);\n",
|
||
" s.src = paths[lib];\n",
|
||
" });\n",
|
||
" }\n",
|
||
"\n",
|
||
" function showError(err) {\n",
|
||
" outputDiv.innerHTML = `<div class=\"error\" style=\"color:red;\">${err}</div>`;\n",
|
||
" throw err;\n",
|
||
" }\n",
|
||
"\n",
|
||
" function displayChart(vegaEmbed) {\n",
|
||
" vegaEmbed(outputDiv, spec, embedOpt)\n",
|
||
" .catch(err => showError(`Javascript Error: ${err.message}<br>This usually means there's a typo in your chart specification. See the javascript console for the full traceback.`));\n",
|
||
" }\n",
|
||
"\n",
|
||
" if(typeof define === \"function\" && define.amd) {\n",
|
||
" requirejs.config({paths});\n",
|
||
" require([\"vega-embed\"], displayChart, err => showError(`Error loading script: ${err.message}`));\n",
|
||
" } else {\n",
|
||
" maybeLoadScript(\"vega\", \"5\")\n",
|
||
" .then(() => maybeLoadScript(\"vega-lite\", \"4.17.0\"))\n",
|
||
" .then(() => maybeLoadScript(\"vega-embed\", \"6\"))\n",
|
||
" .catch(showError)\n",
|
||
" .then(() => displayChart(vegaEmbed));\n",
|
||
" }\n",
|
||
" })({\"config\": {\"view\": {\"continuousWidth\": 400, \"continuousHeight\": 300}}, \"data\": {\"name\": \"data-da2bacd5b3a57271f77be4dc435a345f\"}, \"mark\": \"bar\", \"encoding\": {\"x\": {\"field\": \"count\", \"type\": \"quantitative\"}, \"y\": {\"field\": \"handle\", \"sort\": \"-x\", \"type\": \"nominal\"}}, \"$schema\": \"https://vega.github.io/schema/vega-lite/v4.17.0.json\", \"datasets\": {\"data-da2bacd5b3a57271f77be4dc435a345f\": [{\"handle\": \"polizei_ffm\", \"count\": 2993, \"Name\": null, \"Typ\": null, \"Bundesland\": null, \"Stadt\": null, \"LAT\": null, \"LONG\": null}, {\"handle\": \"polizei_nrw_do\", \"count\": 2860, \"Name\": \"Polizei NRW DO\", \"Typ\": \"Polizei\", \"Bundesland\": \"Nordrhein-Westfalen\", \"Stadt\": \"Dortmund\", \"LAT\": \"51.5142273\", \"LONG\": \"7.4652789\"}, {\"handle\": \"polizeisachsen\", \"count\": 2700, \"Name\": \"Polizei Sachsen\", \"Typ\": \"Polizei\", \"Bundesland\": \"Sachsen\", \"Stadt\": \"Dresden\", \"LAT\": \"51.0493286\", \"LONG\": \"13.7381437\"}, {\"handle\": \"polizeibb\", \"count\": 2310, \"Name\": null, \"Typ\": null, \"Bundesland\": null, \"Stadt\": null, \"LAT\": null, \"LONG\": null}, {\"handle\": \"polizeihamburg\", \"count\": 2093, \"Name\": \"Polizei Hamburg\", \"Typ\": \"Polizei\", \"Bundesland\": \"Hamburg\", \"Stadt\": \"Hamburg\", \"LAT\": \"53.550341\", \"LONG\": \"10.000654\"}, {\"handle\": \"polizeimuenchen\", \"count\": 2021, \"Name\": \"Polizei M\\u00fcnchen\", \"Typ\": \"Polizei\", \"Bundesland\": \"Bayern\", \"Stadt\": \"M\\u00fcnchen\", \"LAT\": \"48.135125\", \"LONG\": \"11.581981\"}, {\"handle\": \"polizeimfr\", \"count\": 1892, \"Name\": \"Polizei Mittelfranken\", \"Typ\": \"Polizei\", \"Bundesland\": \"Bayern\", \"Stadt\": \"N\\u00fcrnberg\", \"LAT\": \"49.453872\", \"LONG\": \"11.077298\"}, {\"handle\": \"polizeimannheim\", \"count\": 1835, \"Name\": \"Polizei Mannheim\", \"Typ\": \"Polizei\", \"Bundesland\": \"Baden-W\\u00fcrttemberg\", \"Stadt\": \"Mannheim\", \"LAT\": 
\"49.4892913\", \"LONG\": \"8.4673098\"}, {\"handle\": \"polizei_nrw_bi\", \"count\": 1794, \"Name\": \"Polizei NRW BI\", \"Typ\": \"Polizei\", \"Bundesland\": \"Nordrhein-Westfalen\", \"Stadt\": \"Bielefeld\", \"LAT\": \"52.0191005\", \"LONG\": \"8.531007\"}, {\"handle\": \"polizei_nrw_k\", \"count\": 1540, \"Name\": \"Polizei NRW K\", \"Typ\": \"Polizei\", \"Bundesland\": \"Nordrhein-Westfalen\", \"Stadt\": \"K\\u00f6ln\", \"LAT\": \"50.938361\", \"LONG\": \"6.959974\"}, {\"handle\": \"bremenpolizei\", \"count\": 1417, \"Name\": null, \"Typ\": null, \"Bundesland\": null, \"Stadt\": null, \"LAT\": null, \"LONG\": null}, {\"handle\": \"polizei_kl\", \"count\": 1380, \"Name\": \"Polizei Kaiserslautern\", \"Typ\": \"Polizei\", \"Bundesland\": \"Rheinland-Pfalz\", \"Stadt\": \"Kaiserslautern\", \"LAT\": \"49.4432174\", \"LONG\": \"7.7689951\"}, {\"handle\": \"polizei_md\", \"count\": 1365, \"Name\": \"Polizei Magdeburg\", \"Typ\": \"Polizei\", \"Bundesland\": \"Sachsen-Anhalt\", \"Stadt\": \"Magdeburg\", \"LAT\": \"52.1315889\", \"LONG\": \"11.6399609\"}, {\"handle\": \"polizei_ka\", \"count\": 1356, \"Name\": \"Polizei Karlsruhe\", \"Typ\": \"Polizei\", \"Bundesland\": \"Baden-W\\u00fcrttemberg\", \"Stadt\": \"Karlsruhe\", \"LAT\": \"49.0068705\", \"LONG\": \"8.4034195\"}, {\"handle\": \"polizeiberlin\", \"count\": 1351, \"Name\": null, \"Typ\": null, \"Bundesland\": null, \"Stadt\": null, \"LAT\": null, \"LONG\": null}]}}, {\"mode\": \"vega-lite\"});\n",
|
||
"</script>"
|
||
],
|
||
"text/plain": [
|
||
"alt.Chart(...)"
|
||
]
|
||
},
|
||
"execution_count": 47,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"barchart = alt.Chart(activy_police_vis[0:15]).mark_bar().encode(\n",
|
||
" x = 'count:Q',\n",
|
||
" y = alt.Y('handle:N', sort = '-x'),\n",
|
||
")\n",
|
||
"barchart "
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "90f686ff-93c6-44d9-9761-feb35dfe9d1d",
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"source": [
|
"Which tweets attract the most attention?"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 90,
|
||
"id": "d0549250-b11f-4762-8500-1134c53303b4",
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"0 Die Gewalt, die unsere Kolleginnen & Kollegen in der Silvesternacht erleben mussten, ist une...\n",
|
||
"1 WICHTIGE Info:\\nÜber das Internet wird derzeit ein Video verbreitet, in dem von einem Überfall a...\n",
|
||
"2 Die Experten gehen derzeit davon aus, dass es sich um ein absichtliches \"Fake-Video\" handelt, da...\n",
|
||
"3 Auf unserem #A45 in #lichterfelde) befindet sich gerade diese Fundhündin. Sie wurde am Hindenbur...\n",
|
||
"4 @nexta_tv Wir haben das Video gesichert und leiten den Sachverhalt an die zuständigen Kolleginne...\n",
|
||
" ... \n",
|
||
"84789 #Polizeimeldungen #Tagesticker\\n \\nAnhalt-Bitterfeld\\nhttps://t.co/tNLEzztL1o\\n \\nDessau-Roßlau\\...\n",
|
||
"84790 Am Mittwoch erhielten wir mehrere Anrufe über einen auffälligen Pkw-Fahrer (Reifen quietschen un...\n",
|
||
"84791 @Jonas5Luisa Kleiner Pro-Tipp von uns: Einfach mal auf den link klicken! ;)*cl\n",
|
||
"84792 Vermisstensuche nach 27-Jährigem aus Bendorf-Mühlhofen: Wer hat Tobias Wißmann gesehen? Ein Foto...\n",
|
||
"84793 #PolizeiNRW #Köln #Leverkusen : XXX - Infos unter https://t.co/SeWShP2tZE https://t.co/Kopy7w8W3B\n",
|
||
"Name: tweet_text, Length: 84794, dtype: object"
|
||
]
|
||
},
|
||
"execution_count": 90,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"tweets_attention = tweets_combined.merge(tweets_user,\n",
|
||
" on = \"user_id\",\n",
|
||
" how = \"left\"\n",
|
||
" ).merge(police_stations,\n",
|
||
" on = \"handle\",\n",
|
||
" how = \"left\")\n",
|
||
"pd.options.display.max_colwidth = 100\n",
|
||
"tweets_attention.sort_values('like_count', ascending = False).reset_index()['tweet_text']\n",
|
||
"\n"
|
||
]
|
||
}
|
||
],
|
||
"metadata": {
|
||
"kernelspec": {
|
||
"display_name": "python-scientific kernel",
|
||
"language": "python",
|
||
"name": "python-scientific"
|
||
},
|
||
"language_info": {
|
||
"codemirror_mode": {
|
||
"name": "ipython",
|
||
"version": 3
|
||
},
|
||
"file_extension": ".py",
|
||
"mimetype": "text/x-python",
|
||
"name": "python",
|
||
"nbconvert_exporter": "python",
|
||
"pygments_lexer": "ipython3",
|
||
"version": "3.10.9"
|
||
}
|
||
},
|
||
"nbformat": 4,
|
||
"nbformat_minor": 5
|
||
}
|