Kiwi Scraper

Python-based webscraper designed to scrape kiwifarms.net threads and filter them according to post ratings or user.

This program allows users to filter by:

  • positive ratings
  • negative ratings
  • neutral ratings
  • overall ratings
  • specific rating
  • weighted score (positive ratings have positive values, negative ratings have negative values)
  • specific user
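
The weighted score can be illustrated with a short sketch. The rating categories below are inferred from the sample run in the Usage section, not taken from the scraper's source, and the names `POSITIVE`, `NEGATIVE`, and `weighted_score` are hypothetical.

```python
# Category sets inferred from the sample run in this README (an assumption;
# the scraper's real lists may differ or be longer).
POSITIVE = {"Like", "Agree", "Winner", "Informative", "Feels", "Semper Fidelis"}
NEGATIVE = {"Autistic"}


def weighted_score(ratings):
    """Return positive count minus negative count; other ratings add nothing."""
    positive = sum(n for name, n in ratings.items() if name in POSITIVE)
    negative = sum(n for name, n in ratings.items() if name in NEGATIVE)
    return positive - negative


# Ratings of post #1 from the sample run.
post_1 = {
    "Like": 16, "Agree": 2, "Winner": 44, "Informative": 101, "Feels": 1,
    "Islamic Content": 2, "Autistic": 3, "Horrifying": 1,
    "Semper Fidelis": 6, "DRINK!": 2,
}

print(weighted_score(post_1))  # 167, matching "Weighted: 167" in the sample run
```

Ratings outside both sets, such as the neutral ones, leave the score unchanged, which is consistent with the description of filter option (6) in the sample run.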

The program outputs its findings as JSON, a text file, or plain text.
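
As a rough illustration of the JSON and plain-text forms, the sketch below serializes one post both ways. The record layout and the helper names `to_json` and `to_plain_text` are assumptions made for this example; the scraper's actual JSON keys are not documented in this README.

```python
import json

# Hypothetical record layout, filled with values from the sample run.
post = {
    "user": "Kugelsak Kastengrus 6th",
    "date": "May 1, 2017",
    "number": 2,
    "ratings": {"Like": 2, "Agree": 29, "Informative": 4, "Feels": 3},
}


def to_json(posts):
    """Serialize a list of post records as indented JSON."""
    return json.dumps(posts, indent=2, ensure_ascii=False)


def to_plain_text(p):
    """Format one post along the lines of the console output in the sample run."""
    header = f"POST FOUND - User: {p['user']} | Date: {p['date']} | #{p['number']}"
    ratings = "  ".join(f"{name}: {count}" for name, count in p["ratings"].items())
    return f"{header}\n{ratings}"


print(to_json([post]))
print(to_plain_text(post))
```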

Examples:

json output

plain text output

Required Packages

This program requires the following packages:

pip install bs4
pip install requests
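
To confirm that both packages are importable before running the scraper, a small check along these lines may help. It is a sketch that uses only the standard library; `missing_modules` is a hypothetical helper, not part of the scraper.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


# "bs4" and "requests" are the import names of the two packages listed above.
missing = missing_modules(["bs4", "requests"])
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are available.")
```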

Usage

Run scraper.py in /scraper to initialize the program.

Example run:

Welcome to the Kiwi Scraper!

Please provide the link to the thread you want analyzed below.

Please note that this program will start searching at the first thread page that you link to, so if you'd like the thread analyzed starting at the first page, please link to the first page of the thread; otherwise, provide a link to the first page you want scraped.

If the link that you provide is valid but not working, try using the .nl/.pl domains, as Cloudflare might be blocking your request.

: https://kiwifarms.net/threads/russell-greer-theofficialinstaofrussellgreer.30488/

Fetching thread...
------------------------------------------
Thread: Russell Greer / @theofficialinstaofrussellgreer - Swift-Obsessed Sex Pest, Magical Star Buddy
------------------------------------------
Would you like to stop at a certain page (y/n)?: y
What page do you want to stop at?: 1
------------------------------------------
How would you like the thread to be filtered?

(1) positive ratings
(2) negative ratings
(3) neutral ratings
(4) total ratings
(5) specific rating
(6) weighted score (positive ratings count as positive points, negative ratings count as negative points, and neutral ratings don't count)
(7) specific user

: 4
------------------------------------------
Enter a minimum number of total ratings.
: 20

Grab some popcorn, this might take a while...


POST FOUND - User: Cryin RN | Date: May 1, 2017 | #1
Total Reactions: 178  Weighted: 167  Positive: 170  Negative: 3  Neutral: 3
Like: 16  Agree: 2  Winner: 44  Informative: 101  Feels: 1  Islamic Content: 2  Autistic: 3  Horrifying: 1  Semper Fidelis: 6  DRINK!: 2

POST FOUND - User: Kugelsak Kastengrus 6th | Date: May 1, 2017 | #2
Total Reactions: 38  Weighted: 38  Positive: 38  Negative: 0  Neutral: 0
Like: 2  Agree: 29  Informative: 4  Feels: 3

/you get the idea/ 

Scraping finished.
Posts found: 15
------------------------------------------
Please enter directory to save output to.
: /path/to/directory
What would you like to name your JSON file?
: filename

Creating file...
Successful.
File saved to path\to\directory\filename.json
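
The sample run suggests trying the .nl/.pl domains when a valid link does not work. A minimal sketch of building those alternate links is shown here. It assumes the alternate sites use the same host name with only the top-level domain changed, and the thread URL and the helper name `alternate_urls` are hypothetical. The sketch only rewrites the link and does not check whether a domain is reachable.

```python
from urllib.parse import urlsplit, urlunsplit

# Alternate top-level domains mentioned in the sample run.
ALTERNATE_TLDS = ("nl", "pl")


def alternate_urls(url):
    """Return copies of `url` with the host's top-level domain swapped.

    Assumes the host has no port or credentials attached.
    """
    parts = urlsplit(url)
    base, _, _tld = parts.netloc.rpartition(".")
    return [
        urlunsplit(parts._replace(netloc=f"{base}.{tld}"))
        for tld in ALTERNATE_TLDS
    ]


# Hypothetical thread link used only for illustration.
link = "https://kiwifarms.net/threads/example-thread.12345/"
for candidate in alternate_urls(link):
    print(candidate)
```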

Upcoming

  • I plan to add formatted text output, with special markings for quotes, media, and other special content within a post. This content is currently left blank in the JSON output.

Contributing

Pull requests are welcome. I’m always open to new ideas.