61 Commits

Author SHA1 Message Date
Michael Beck
01e58b1b99 adds html files to gitignore 2023-08-31 01:21:31 +02:00
Michael Beck
d0fcefedf4 deletes data/OUT/profiles/CovTweets.html 2023-08-31 01:20:39 +02:00
Michael Beck
71cf907249 deletes data/OUT/profiles/AllTweets.html 2023-08-31 01:20:31 +02:00
Michael Beck
a9018fedee REALLY corrects the filetree 2023-08-30 21:54:13 +02:00
Michael Beck
d94a93295f corrects filetree 2023-08-30 21:53:05 +02:00
Michael Beck
80b63b39df adds readme 0.2.0 2023-08-30 21:45:38 +02:00
Michael Beck
d8136909c8 corrects import of own functions that didn't work anymore because of a newer python version. 2023-08-30 21:45:27 +02:00
Michael Beck
1c6d9d5415 cleans and renames files 2023-08-30 21:18:55 +02:00
Michael Beck
4e08cde317 finishes classification scripts 2023-08-16 10:06:16 +02:00
Michael Beck
2535683cdc finishes classification scripts 2023-08-15 14:51:28 +02:00
Michael Beck
8f744a08be adds final counter keywords 2023-08-15 14:30:40 +02:00
Michael Beck
df5fd51a5f repairs stupid 2023-08-15 14:30:13 +02:00
Michael Beck
3d4f559d2d adds model training stats 2023-08-15 14:29:42 +02:00
Michael Beck
2e067b6a64 adds both classification scripts. Corrects inclusion of CleanTweets functions. 2023-08-15 14:23:56 +02:00
Michael Beck
7a16526a97 adds dataset profiles 2023-08-15 14:20:13 +02:00
Michael Beck
b89b5969ec adds TypeError controls 2023-08-15 14:19:33 +02:00
Michael Beck
7c6b618272 adds both training scripts and evaluation files of topic classification 2023-08-15 14:19:08 +02:00
Michael Beck
90aa58239c adds generation of model-training dataset 2023-08-14 15:37:30 +02:00
Michael Beck
1beff96ae9 adds model training code 2023-08-14 15:37:05 +02:00
Michael Beck
881d3d6d6d adds tweet-text-cleaning functions 2023-08-14 15:36:46 +02:00
Michael Beck
5a63c478e9 adds dataset profiler 2023-08-08 15:32:12 +02:00
Michael Beck
ed61d52182 adds files to gitignore 2023-08-08 00:07:42 +02:00
Michael Beck
a26d150060 renames pretest classification file 2023-08-08 00:06:18 +02:00
Michael Beck
d791e4a293 adds classification file. adds removal of empty tweets after transformation for classification preparation 2023-08-08 00:04:14 +02:00
Michael Beck
d57b7a31b7 adds more counter keywords 2023-08-08 00:03:30 +02:00
Michael Beck
13d80124d3 adds lines with counterKeywords to remove non-covid tweets 2023-08-07 23:45:11 +02:00
Michael Beck
3de6d8f3ec adds tweetLen column, converts keywords to lowercase and removes certain keywords 2023-08-07 23:07:29 +02:00
Michael Beck
899a99ba72 adds CleanTweets functions, creates Graphs 2023-07-07 18:18:51 +02:00
Michael Beck
817ec48478 corrects a lot of mistakes.
adds keywords
adds analyze.py
adds pretest
adds pretest ids
2023-07-07 00:16:44 +02:00
Michael Beck
c64904a64d adds cleanTweets.py 2023-06-26 23:51:32 +02:00
Michael Beck
82830f13e2 Update "README.md" 2023-06-26 13:12:16 +02:00
Michael Beck
8c8a191952 Add "README.md" 2023-06-26 13:12:04 +02:00
Michael Beck
71e10a62d3 adds senator data scraper 2023-06-23 23:53:31 +02:00
Michael Beck
90d5501ec8 adds comment 2023-06-23 23:53:01 +02:00
Michael Beck
340cca017c corrects comments 0.1.5 2023-06-23 20:59:14 +02:00
Michael Beck
791cebc297 adds log folder 2023-06-23 20:49:35 +02:00
Michael Beck
6241484e83 adds gitkeep 2023-06-23 20:47:32 +02:00
Michael Beck
d73da8db98 Merge remote-tracking branch 'origin/master' 2023-06-23 20:42:58 +02:00
Michael Beck
6220c1841d Delete "collect.ipynb" 2023-06-23 20:41:56 +02:00
Michael Beck
27746cd886 changes folder structure of in- and output files 0.1.2 2023-06-23 20:39:40 +02:00
Michael Beck
02c3d055bd adds comments. changes logfile format to .log 0.1.1 2023-06-23 20:34:46 +02:00
Michael Beck
dc2e17cc2f adds docstrings to functions. adds several comments. 2023-06-23 20:26:16 +02:00
Michael Beck
e8ba02ca0f fixes multiprocessing. 0.1.0 2023-06-23 19:18:03 +02:00
Michael Beck
b00f75e9fe corrects some mistakes 2023-06-23 18:09:09 +02:00
Michael Beck
1b43b295ce adds filechecks 2023-06-23 17:47:23 +02:00
Michael Beck
fb7a70cf66 adds missing file report 2023-06-23 17:04:08 +02:00
Michael Beck
1a19fd407a adds alt_accounts check and removes NANs from alt_accounts. Prints accounts to output more beautifully. 2023-06-23 16:54:57 +02:00
Michael Beck
5d0c41407e adds multiprocessing to scrape tweets. 2023-06-23 16:41:20 +02:00
Michael Beck
c675db9d00 adds python and lockfiles to gitignore 2023-06-23 15:59:29 +02:00
Michael Beck
88c016a2a6 adds 2023-06-23 15:57:31 +02:00