Cyberspace of Shujun LI

Shortcuts

General

e-Data and Data Analytics Services: DQ Institute (Child Online Safety Index (COSI)) Research Data Alliance (RDA) Zenodo Figshare Dryad DataCite Wolfram Data Repository The GDELT Project: A Global Database of Society data.world Global Open Data Index (Open Knowledge) Sunlight Foundation data.europa.eu JRC (Joint Research Centre) Data Catalogue UK Data Service Smart Data Research UK DARE UK (Data and Analytics Research Environments UK) (Community Groups) ADR UK (Administrative Data Research UK) data.police.uk Police Data Initiative mldata (machine learning data set repository) MLcomp datasets UC Irvine Machine Learning Repository Kaggle KNIME Google Dataset Search Google Public Data Explorer Common Crawl Elicit: The AI Research Assistant People Data Labs (Free Company Dataset, Largest US Employers by Metro Dataset, Free Job Title Dataset, Free Engineering Skills Dataset) Awesome Public Datasets Network Repository: An Interactive Scientific Network Data Repository University Domains and Names Data List & API The Data City (The Innovation Clusters Map @ DSIT) WhatDoTheyKnow @ mySociety 数据堂 (Datatang) D3.js Graph Gallery The Data Visualisation Catalogue From Data to Viz informationisbeautiful.net Meltwater DataGenetics Informatica Corporation Splunk Inc. Tableau Software Open Measures (GitLab) Newspaper3k: Article scraping & curation Otter.ai Trint Scrintal DBpedia (DBpedia Databus; DBpedia Archivo; DBpedia MARVIN Release Bot; DBpedia Information Extraction Framework; DBpedia Forum; DBpedia Spotlight) BabelNet (Babelfy) Dexter, an Open Source Framework for Entity Linking

Personal Data Management Platforms: MyData Global HAT (Hub-of-All-Things) (HATDeX - HAT Data Exchange Ltd, HAT Community Foundation (HCF), Documentation for Developers) World Data Exchange (digi.me) Aircloak openPDS/SafeAnswers: Personal Data with Privacy

Health Data: Universal Health Coverage (UHC) 2030 (Health Data Collaborative; UHC Data Portal) International Population Data Linkage Network (IPDLN) European Health Data Space (EHDS) European Institute for Innovation through Health Data (i~HD) European Institute for Health Records (EuroRec) Data Insights for NHS England @ datainsights.uk AI Centre for Value Based Healthcare (SDE-AIC work; NHS Data Architecture & Data Flows) Association of Professional Healthcare Analysts (AphA) Health and Care Analytics Conference (HCAC) UK Health Data Research Alliance (UKHDRA) Health Data Research UK (HDR UK) Health Data Research Innovation Gateway NHS Research Secure Data Environment (SDE) Network @ HDR UK UK Health Data Research Alliance Data @ NHS Digital Data Uses Registers @ NHS Digital (data sharing agreements) NHS Digital Data Release Register - reformatted (third-party analysis showing whether past data releases respected opt outs or not) Health and social care @ ONS Health @ UK Data Service Fingertips | Public health data @ Office for Health Improvement & Disparities use MY data Understanding Patient Data (UPD) medConfidential Cancer Register @ Cancer Research UK Public Engagement in Data Research Initiative (PEDRI) DATA-CAN: The UK's Health Data Research Hub for Cancer UCL Institute of Health Equity (IHE) Public Engagement in Data Research Initiative (PEDRI) NHS Kent and Medway (ICB) (Shared Health and Care Analytics Board (SHcAB) @ NHS Kent and Medway) Kent and Medway Integrated Care System Kent and Medway Care Record (KMCR) Kent and Medway Data Warehouse (KMDW) @ Maidstone and Tunbridge Wells NHS Trust (Kent Research Network for Education and Learning (KeRNEL))

False Information

Organizations, Tools and Resources: Zhijiang Guo's Automated-Fact-Checking-Resources @ GitHub Journalism, 'Fake News' and Disinformation: A Handbook for Journalism Education and Training (UNESCO) Combating the disinfodemic: Working for truth in the time of COVID-19 (UNESCO) Combating the Disinfodemic: Working for truth in the time of COVID-19 (UNESCO and UNITAR Divisions for Multilateral Diplomacy and Prosperity's mobile e-learning course) WHO's Information Network for Epidemics (EPI-WIN) W3C Credible Web Community Group (GitHub, Credible Web CG Area-2 (Corroboration-Based Strategies)) International Center for Journalists (ICFJ) (International Journalists' Network (IJNet), A Short Guide to the History of ‘Fake News’ and Disinformation: A New ICFJ Learning Module) EU DisinfoLab Hatebase Google News Initiative (GNI) (Digital News Innovation (DNI) Fund, GNI Innovation Challenges) WikiCred Truth Decay @ RAND (Fighting Disinformation Online: A Database of Web Tools) NewsGuard (Firefox extension - NewsGuard, Google Chrome extension - NewsGuard, Firefox extension - HealthGuard, Android app - NewsGuard; COVID-19 Misinformation Resources, Coronavirus Misinformation Tracking Center) CheckStep misinformation datasets @ data.world FakeNewsTracker Google Fact Check (Google Fact Check Tools API, Google Fact Check Explorer, Google Fact Check Markup Tool) Fact-check Feed @ fact.pubmedia.us SMAT: The Social Media Analysis Toolkit Verifi! News Landscape (NELA) Toolkit Media Manipulation Casebook 台灣事實查核中心 (Taiwan FactCheck Center) Lead Stories Full Fact First Draft AllSides (Media Bias Ratings, Media Bias Chart™, Rate Your Bias) Media Bias/Fact Check (MBFC) Science Media Centre MisinfoCon Credibility Coalition Center for Countering Digital Hate (CCDH) (Stop Funding Misinformation) Fake News Challenge (FNC) (Stance Detection dataset for FNC-1) Poynter Institute (International Fact-Checking Network - IFCN, IFCN Code of Principles; MediaWise Teen Fact-Checking Network (TFCN), #CoronaVirusFacts Alliance, CoronaVirusFacts/DatosCoronaVirus Alliance Database) Fairness & Accuracy In Reporting (FAIR) Content Authenticity Initiative (CAI) Project Origin: Protecting Trusted Media Coalition for Content Provenance and Authenticity (C2PA) Global Investigative Journalism Network (GIJN) (Bellingcat) European Fact-Checking Standards Network (EFCSN) BBC Disinformation Watch BBC Reality Check FactCheck @ Channel 4 News Fact check @ The Ferret The Reporters' Lab (Fact-Checking, The Duke Tech & Check Cooperative, ClaimReview) Truth or Fiction Check Your Fact FactsCan AFP Fact Check Africa Check Cambridge's Misinformation Susceptibility Test (MIST) Bad News game Go Viral! Inquiring Online Snopes TruthOrFiction.com FactCheck.org PolitiFact (Politifact Fact Check Dataset 2008-22) Global Disinformation Index (GDI) Suggest Fact Checker @ The Washington Post COVID-19 Misinformation Newsletter @ Programme on Democracy and Technology (DemTech), Oxford University Prism Metanews (Anti-Misinformation Resources: The Catalog) DISARM Foundation DisinfoDocket Misinformation Exposure (Nature Communications 2022) PressDB Truth Social Arkose Labs (Inventory fraud, Fake account creation) Lie Detectors Fakespot ReviewMeta TheReviewIndex Yelp Open Dataset UCSD McAuley Lab's Amazon Reviews'23 dataset (2018 dataset, 2014 dataset, 2013 dataset) Fake Reviews Dataset (Journal of Retailing and Consumer Services 2022) YelpCHI dataset (ICWSM 2013 and KDD 2015) YelpZip dataset (KDD 2015 and SIAM SDM 2016) Yelp-Fraud (Multi-relational Graph Dataset for Yelp Spam Review Detection) (CIKM 2020) Amazon-Fraud (Multi-relational Graph Dataset for Amazon Fraudulent Account Detection) (CIKM 2020) Masterpiece Generator - Song Lyrics Generator Tweetgen fake-resume-generator

Multimedia False Information: JPEG Fake Media Awesome-Misinfo-Video-Detection Awesome Deepfakes Awesome Deepfakes Detection fake-face-detection: some collected paper and personal notes relevant to Fake Face Detection DeepFake-o-meter: An open platform integrating state-of-the-art DeepFake detection methods StyleGAN StyleGAN2 StyleGAN2-ADA (Official TensorFlow implementation) StyleGAN2-ADA (Official PyTorch implementation) StyleGAN3 DeepFaceLab DeepFaceLive OpenAI's DALL·E 2 (Stability Generator API @ GitHub) thiscatdoesnotexist.com thishorsedoesnotexist.com thisartworkdoesnotexist.com thischemicaldoesnotexist.com thesecatsdonotexist.com thisanimedoesnotexist.ai thisponydoesnotexist.net thiswaifudoesnotexist.net whichfaceisreal.com spotdeepfakes.org generated.photos (datasets, face generator, free generated photos) DeepFaceDrawing: Deep Generation of Face Images from Sketches (SIGGRAPH 2020) (DeepFaceDrawing-Jittor @ GitHub) DeepNude-an-Image-to-Image-technology pix2pix: Image-to-Image Translation with Conditional Adversarial Nets (CVPR 2017) (CycleGAN and pix2pix in PyTorch @ GitHub, original code @ GitHub, Christopher Hesse's interactive demo) Deepware Scanner (GitHub) Flickr-Faces-HQ Dataset (FFHQ) Adversarial Deepfakes (WACV 2021) DefakeHop (ICME 2021) DeepFaceLab Faceswap (GitHub) MyFakeApp (based on Faceswap) ZAO App Reface App WOMBO Botika MyVoiceYourFace.com Virtual Humans FaceForensics Benchmark Partnership on AI's AI and Media Integrity Steering Committee (Deepfake Detection Challenge = DFDC) WildDeepfake: A Challenging Real-World Dataset for Deepfake Detection Celeb-DF (v2): A New Dataset for DeepFake Forensics (CVPR 2020) (GitHub) KoDF: A Large-scale Korean DeepFake Detection Dataset (CVPR 2021) CoMoFoD - Image Database for Copy-Move Forgery Detection Copy-Move Forgery Database with Similar but Genuine Objects (COVERAGE) GANDCTAnalysis
MKLab-ITI's image-verification-corpus Assembler (Google's project) A corpus of debunked and verified user-generated videos (Online Information Review 2019) Fake-EmoReact 2021 Challenge @ SocialNLP 2021 EmotionGIF 2020 Challenge @ SocialNLP 2020 Truepic NewsCLIPpings: Automatic Generation of Out-of-Context Multimodal Media (EMNLP 2021) Visual News: Benchmark and Challenges in News Image Captioning (EMNLP 2021)

Other Research Related: LLMs Meet Misinformation WeVerify GATE (General Architecture for Text Engineering) PHEME project PAN (a series of scientific events and shared tasks on digital text forensics and stylometry) AVeriTeC @ NeurIPS 2023 CLEF2020 CheckThat! Lab (Enabling Automatic Identification and Verification of Claims in Social Media) CLEF2019 CheckThat! Lab CLEF2018 CheckThat! Lab FEVER Datasets (scientific claims) ClaimBuster: Automated Live Fact-checking (ClaimPortal, ICWSM 2020 dataset) Claim Detection in Social Media via Fusion of Transformer and Syntactic Features (CLEF CheckThat! 2020) ClaimsKG claim-rank (RANLP 2017) Claim Extraction for Scientific Publications SciFact (GitHub) Too Many Claims to Fact-Check: Prioritizing Political Claims Based on Check-Worthiness (MAISoN'2020 @ CIKM'2020) entity-fishing - Entity Recognition and Disambiguation Full Fact's Fast & Furious Fact Check Challenge (2016) Iffy.news (Iffy Index of Unreliable Sources, Wayback Workshop) OSoMe (Observatory on Social Media) @ Network Science Institute (IUNI), Center for Complex Networks and Systems Research (CNetS), Indiana University (Tools and Datasets: Hoaxy®, Fakey, Botometer, BotSlayer, CoVaxxy, EchoDemo) Graph-based Fraud Detection Papers and Resources VoterFraud2020 (@ GitHub, @ Figshare) FakeNewsNet Maciej Szpakowski's Fake News Corpus Fakeddit (GitHub) Credibilator (Google Chrome extension) Dichotomies of Disinformation EX-FEVER: A Dataset for Multi-hop Explainable Fact Verification (2023) MuMiN: A Large-Scale Multilingual Multimodal Fact-Checked Misinformation Social Network Dataset (SIGIR 2022) LOCO: the 88-million word language of conspiracy corpus (2021) Avax (anti-vaccine) tweets dataset (2021) The COVID-19 Infodemic: Can the Crowd Judge Recent Misinformation Objectively? (SIGIR 2020 + ECIR 2020 + CIKM 2020) ReCOVery: A Multimodal Repository for COVID-19 News Credibility Research (CIKM 2020) FakeCovid: Fact Checked data for COVID-19 (ICWSM 2020 workshop) Dataset for COVID-19 Misinformation on Twitter (2020) CHECKED: Chinese COVID-19 Fake News Dataset (2020) Factuality and Bias Prediction of News Media (ACL 2020 + EMNLP 2018) FakeHealth repository (ICWSM 2020) FiveThirtyEight's dataset of 3 million Russian troll tweets Raiders of the Lost Kek: 3.5 Years of Augmented 4chan Posts from the Politically Incorrect Board (ICWSM 2020) Learning from Fact-checkers (SIGIR 2019) The Rise of Guardians (SIGIR 2018) LIAR-PLUS fake news database (FEVER 2018) LIAR fake news database (ACL 2017) CREDBANK-data (ICWSM 2015) 中文谣言数据 (Chinese rumor data) (中国科学: 信息科学, Scientia Sinica Informationis, 2015)

Information Visualization

Tools: Transparency Vis

More

Tools: GetOldTweets-python GetOldTweets-java GetOldTweets3
Data: COVID-19 @ Aminer (COVID-19 Open Datasets, dashboard)

Disclaimer

All information on this website is for personal use, and Shujun Li is not responsible for any misuse of the information provided. The links listed on any page do not indicate personal recommendations for any purpose to visitors of this website, as each link is included for a different reason meaningful for Shujun Li's personal use. Logo files of websites are used to facilitate recognition of the external links, and do not represent endorsement of the corresponding websites for the content of this website. If the use of any logo file violates the copyrights or policies of any individuals or organisations, please contact Shujun Li so that he can remove the logo file or the whole link. Please also help report broken links and broken images on this website.