diff --git a/README.md b/README.md
index 2ef7135..9f2a2d9 100644
--- a/README.md
+++ b/README.md
@@ -5,23 +5,28 @@ This program downloads imgur, gfycat and direct image and video links of saved p
 ## Table of Contents
+- [What can it do?](#what-can-it-do)
 - [Requirements](#requirements)
 - [Setting up the script](#setting-up-the-script)
 - [Creating an imgur app](#creating-an-imgur-app)
 - [Program Modes](#program-modes)
-  - [saved mode](#saved-mode)
-  - [submitted mode](#submitted-mode)
-  - [upvoted mode](#upvoted-mode)
-  - [subreddit mode](#subreddit-mode)
-  - [multireddit mode](#multireddit-mode)
-  - [link mode](#link-mode)
-  - [log read mode](#log-read-mode)
 - [Running the script](#running-the-script)
 - [Using the command line arguments](#using-the-command-line-arguments)
 - [Examples](#examples)
 - [FAQ](#faq)
 - [Changelog](#changelog)
-  - [release-1.0.0](#release-100)
+
+## What can it do?
+### It...
+- can get posts from the frontpage, subreddits, multireddits, a redditor's submissions, upvoted and saved posts, search results, or plain reddit links
+- sorts posts by hot, top, new and so on
+- downloads imgur albums, gfycat links, [self posts](#i-can-t-open-the-self-posts-) and any direct image link
+- skips posts that have already been downloaded
+- uses post titles in file names
+- puts every post into its subreddit's folder
+- saves a reusable copy of the details of every post it finds, so that they can be downloaded again
+- logs failed downloads to a file so that you can retry them later
+- can be run by double-clicking on Windows (but I don't recommend it)
 
 ## Requirements
 - Python 3.x*
@@ -49,38 +54,27 @@ It should redirect to a page which shows your **imgur_client_id** and **imgur_cl
 ## Program Modes
 All the program modes are activated with command-line arguments as shown [here](#using-the-command-line-arguments)
-### saved mode
-In saved mode, the program gets posts from given user's saved posts. 
-### submitted mode
-In submitted mode, the program gets posts from given user's submitted posts.
-### upvoted mode
-In submitted mode, the program gets posts from given user's upvoted posts.
-### subreddit mode
-In subreddit mode, the program gets posts from given subreddits* that is sorted by given type and limited by given number.
-
-Multiple subreddits can be given
-
-*You may also use search in this mode. See [`py -3 script.py --help`](#using-the-command-line-arguments).*
-### multireddit mode
-In multireddit mode, the program gets posts from given user's given multireddit that is sorted by given type and limited by given number.
-### link mode
-In link mode, the program gets posts from given reddit link.
-
-You may customize the behaviour with `--sort`, `--time`, `--limit`.
-
-*You may also use search in this mode. See [`py -3 script.py --help`](#using-the-command-line-arguments).*
-
-## log read mode
-Two log files are created each time *script.py* runs.
-- **POSTS** Saves all the posts without filtering.
-- **FAILED** Keeps track of posts that are tried to be downloaded but failed.
-
-In log mode, the program takes a log file which created by itself, reads posts and tries downloading them again.
-
-Running log read mode for FAILED.json file once after the download is complete is **HIGHLY** recommended as unexpected problems may occur.
+- **saved mode**
+  - Gets posts from the given user's saved posts.
+- **submitted mode**
+  - Gets posts from the given user's submitted posts.
+- **upvoted mode**
+  - Gets posts from the given user's upvoted posts.
+- **subreddit mode**
+  - Gets posts from the given subreddit or subreddits, sorted by the given type and limited to the given number.
+  - You may also use search in this mode. See [`py -3 script.py --help`](#using-the-command-line-arguments).
+- **multireddit mode**
+  - Gets posts from the given user's given multireddit, sorted by the given type and limited to the given number.
+- **link mode**
+  - Gets posts from the given reddit link. 
+  - You may customize the behaviour with `--sort`, `--time` and `--limit`.
+  - You may also use search in this mode. See [`py -3 script.py --help`](#using-the-command-line-arguments).
+- **log read mode**
+  - Takes a log file created by the script itself (a JSON file), reads the posts in it and tries to download them again.
+  - Running log read mode on the FAILED.json file once after the download is complete is **HIGHLY** recommended, as unexpected problems may occur.
 
 ## Running the script
-**WARNING** *DO NOT* let more than *1* instance of script run as it interferes with IMGUR Request Rate.
+**DO NOT** let more than one instance of the script run, as it interferes with the imgur API rate limit.
 
 ### Using the command line arguments
 If no arguments are passed program will prompt you for arguments below which means you may start up the script with double-clicking on it (at least on Windows for sure).
@@ -89,7 +83,7 @@ Open up the [terminal](https://www.reddit.com/r/NSFW411/comments/8vtnl8/meta_i_m
 Run the script.py file from terminal with command-line arguments. Here is the help page:
-**ATTENTION** Use `.\` for current directory and `..\` for upper directory when using short directories, otherwise it might act weird.
+Use `.\` for the current directory and `..\` for the parent directory when using relative paths; otherwise the script might act weird.
 ```console
 $ py -3 script.py --help
@@ -166,6 +160,10 @@ py -3 script.py C:\\NEW_FOLDER\\ANOTHER_FOLDER --log UNNAMED_FOLDER\\FAILED.json
 ### I can't startup the script no matter what.
 - Try `python3` or `python` or `py -3` as python have real issues about naming their program
+### I can't open the self posts.
+- Self posts are stored on reddit as Markdown, so the script downloads them as Markdown in order not to lose their styling. There is a great Chrome extension [here](https://chrome.google.com/webstore/detail/markdown-viewer/ckkdlimhmcjmikdlpkmbgfkaikojcbjk) for viewing Markdown files with their styling. Install it and open the downloaded files with Chrome. 
+
 ## Changelog
-### v1.0.0
-- Initial release
+### 10/07/2018
+- Added support for *self* posts
+- Now getting posts is quicker
diff --git a/script.py b/script.py
index 6747ca3..c7ffbf2 100644
--- a/script.py
+++ b/script.py
@@ -11,7 +11,7 @@ import sys
 import time
 from pathlib import Path, PurePath
-from src.downloader import Direct, Gfycat, Imgur
+from src.downloader import Direct, Gfycat, Imgur, Self
 from src.parser import LinkDesigner
 from src.searcher import getPosts
 from src.tools import (GLOBAL, createLogFile, jsonFile, nameCorrector,
@@ -451,7 +451,22 @@ def download(submissions):
             print(exception)
             FAILED_FILE.add({int(i+1):[str(exception),submissions[i]]})
             downloadedCount -= 1
-
+
+        elif submissions[i]['postType'] == 'self':
+            print("SELF")
+            try:
+                Self(directory,submissions[i])
+
+            except FileAlreadyExistsError:
+                print("It already exists")
+                downloadedCount -= 1
+                duplicates += 1
+
+            except Exception as exception:
+                print(exception)
+                FAILED_FILE.add({int(i+1):[str(exception),submissions[i]]})
+                downloadedCount -= 1
+
         else:
             print("No match found, skipping...")
             downloadedCount -= 1
diff --git a/src/downloader.py b/src/downloader.py
index 5a2df24..3c21f32 100644
--- a/src/downloader.py
+++ b/src/downloader.py
@@ -1,3 +1,4 @@
+import io
 import os
 import sys
 import urllib.request
@@ -16,7 +17,7 @@ except ModuleNotFoundError:
     install("imgurpython")
     from imgurpython import *
-
+VanillaPrint = print
 print = printToFile
 
 def dlProgress(count, blockSize, totalSize):
@@ -294,3 +295,45 @@ class Direct:
         tempDir = directory / (POST['postId']+".tmp")
 
         getFile(fileDir,tempDir,POST['postURL'])
+
+class Self:
+    def __init__(self,directory,post):
+        if not os.path.exists(directory): os.makedirs(directory)
+
+        title = nameCorrector(post['postTitle'])
+        print(title+"_"+post['postId']+".md")
+
+        fileDir = title+"_"+post['postId']+".md"
+        fileDir = directory / fileDir
+
+        if Path.is_file(fileDir):
+            raise FileAlreadyExistsError
+
+        try:
+            self.writeToFile(fileDir,post)
+        except FileNotFoundError:
+            # Fall back to a short file name if the title-based one is invalid
+            fileDir = post['postId']+".md"
+            fileDir = directory / fileDir
+
+            self.writeToFile(fileDir,post)
+
+    @staticmethod
+    def writeToFile(directory,post):
+
+        content = ("## ["
+                   + post["postTitle"]
+                   + "]("
+                   + post["postURL"]
+                   + ")\n"
+                   + post["postContent"]
+                   + "\n\n---\n\n"
+                   + "submitted by [u/"
+                   + post["postSubmitter"]
+                   + "](https://www.reddit.com/user/"
+                   + post["postSubmitter"]
+                   + ")")
+
+        with io.open(directory,"w",encoding="utf-8") as FILE:
+            VanillaPrint(content,file=FILE)
+
+        print("Downloaded")
diff --git a/src/searcher.py b/src/searcher.py
index 7568f39..6ac6fbe 100644
--- a/src/searcher.py
+++ b/src/searcher.py
@@ -308,6 +308,10 @@ def redditSearcher(posts,SINGLE_POST=False):
     imgurCount = 0
     global directCount
     directCount = 0
+    global selfCount
+    selfCount = 0
+
+    allPosts = {}
 
     postsFile = createLogFile("POSTS")
 
@@ -356,13 +360,15 @@ def redditSearcher(posts,SINGLE_POST=False):
             printSubmission(submission,subCount,orderCount)
             subList.append(details)
 
-            postsFile.add({subCount:[details]})
+            allPosts[subCount] = [details]
+
+            postsFile.add(allPosts)
 
     if not len(subList) == 0:
         print(
             "\nTotal of {} submissions found!\n"\
-            "{} GFYCATs, {} IMGURs and {} DIRECTs\n"
-            .format(len(subList),gfycatCount,imgurCount,directCount)
+            "{} GFYCATs, {} IMGURs, {} DIRECTs and {} SELF POSTS\n"
+            .format(len(subList),gfycatCount,imgurCount,directCount,selfCount)
         )
         return subList
     else:
@@ -372,6 +378,7 @@ def checkIfMatching(submission):
     global gfycatCount
     global imgurCount
     global directCount
+    global selfCount
 
     try:
         details = {'postId':submission.id,
@@ -397,13 +404,15 @@ def checkIfMatching(submission):
         imgurCount += 1
         return details
 
-    elif isDirectLink(submission.url) is True:
+    elif isDirectLink(submission.url):
         details['postType'] = 'direct'
         directCount += 1
         return details
 
     elif submission.is_self:
         details['postType'] = 'self'
+        details['postContent'] = submission.selftext
+        selfCount += 1
         return details
 
 def printSubmission(SUB,validNumber,totalNumber):
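
For reviewers, the Markdown template that the new `Self.writeToFile` emits can be sketched as a standalone function. This is a minimal sketch, not the repository's code: `format_self_post` is a hypothetical helper, and the example post dict below is invented; its keys mirror the `details` dict built in `src/searcher.py`.

```python
def format_self_post(post):
    """Render a reddit self post as Markdown, mirroring Self.writeToFile:
    a linked title heading, the selftext body, a horizontal rule, and an
    attribution line linking back to the submitter's profile."""
    return ("## [" + post["postTitle"] + "](" + post["postURL"] + ")\n"
            + post["postContent"]
            + "\n\n---\n\n"
            + "submitted by [u/" + post["postSubmitter"]
            + "](https://www.reddit.com/user/" + post["postSubmitter"] + ")")

# Hypothetical example post (not real data):
example = {
    "postTitle": "Hello world",
    "postURL": "https://www.reddit.com/r/test/comments/abc123/",
    "postContent": "Some *Markdown* body.",
    "postSubmitter": "someuser",
}
print(format_self_post(example))
```

Writing the file with `io.open(..., encoding="utf-8")` and printing through the saved `VanillaPrint` alias (captured before `print` is rebound to `printToFile`) is what keeps non-ASCII selftext intact on Windows, where the default file encoding is not UTF-8.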