36 Commits

Author SHA1 Message Date
Ali Parlakçı
cc93aa3012 Update commit link 2018-07-19 15:47:28 +03:00
Ali Parlakci
50c4a8d6d7 Update version 2018-07-19 15:38:49 +03:00
Ali Parlakci
5737904a54 Merge branch 'master' of https://github.com/aliparlakci/bulk-downloader-for-reddit 2018-07-19 15:35:46 +03:00
Ali Parlakci
f6eba6c5b0 Added more gfycat links 2018-07-19 15:34:58 +03:00
Ali Parlakci
41cbb58db3 Added more gfycat links 2018-07-19 15:22:12 +03:00
Ali Parlakçı
c569124406 Update Changelog 2018-07-19 14:58:17 +03:00
Ali Parlakçı
1a3836a8e1 Added v.redd.it support (#36) 2018-07-19 14:57:16 +03:00
Ali Parlakci
fde6a1fac4 Added custom exception descriptions to FAILED.json file 2018-07-19 14:56:00 +03:00
Ali Parlakci
6bba2c4dbb Merge branch 'master' of https://github.com/aliparlakci/bulk-downloader-for-reddit 2018-07-18 09:17:48 +03:00
Ali Parlakci
a078d44236 Edited FAQ 2018-07-18 09:17:36 +03:00
Ali Parlakçı
deae0be769 Delete _config.yml 2018-07-15 10:54:40 +03:00
Ali Parlakci
3cf0203e6b Typo fix 2018-07-13 14:39:01 +03:00
Ali Parlakci
0b31db0e2e Update FAQ 2018-07-13 14:37:35 +03:00
Ali Parlakci
d3f2b1b08e Update executables' names 2018-07-13 14:34:20 +03:00
Ali Parlakci
0ec4bb3008 Update version 2018-07-13 14:18:18 +03:00
Ali Parlakci
0dbe2ed917 Update changelog 2018-07-13 14:13:39 +03:00
Ali Parlakci
9f831e1b78 Added .exe to executable's extension 2018-07-13 14:12:17 +03:00
Ali Parlakci
59012077e1 Excludes build folders 2018-07-13 14:11:41 +03:00
Ali Parlakci
5e3c79160b Changed config.json path 2018-07-13 14:10:21 +03:00
Ali Parlakci
1e8eaa1a8d Update changelog 2018-07-12 23:05:13 +03:00
Ali Parlakci
7dbc83fdce Initial commit 2018-07-12 23:03:00 +03:00
Ali Parlakci
50a77f6ba5 Update version 2018-07-12 22:19:46 +03:00
Ali Parlakci
4f7e406cd6 Take multiple subreddits 2018-07-12 22:00:43 +03:00
Ali Parlakci
ded3cece8c Update changelog 2018-07-12 21:16:20 +03:00
Ali Parlakci
dd671fd738 Accept exit to exit the program when taking arguments 2018-07-12 21:09:31 +03:00
Ali Parlakci
b357dff52c Added more examples 2018-07-12 14:31:39 +03:00
Ali Parlakci
32ffd3b861 Added dependency installation 2018-07-12 14:27:16 +03:00
Ali Parlakci
02673c3950 Update changelog 2018-07-12 14:15:26 +03:00
Ali Parlakci
8448e47080 Wait on KeyboardInterrupt 2018-07-12 14:14:54 +03:00
Ali Parlakci
39f2c73f4c Another option to open terminal added 2018-07-12 13:35:45 +03:00
Ali Parlakci
fe942b4734 Added linux executable guide 2018-07-12 13:32:59 +03:00
Ali Parlakci
205617e051 Changed quit() to sys.exit() 2018-07-12 13:00:02 +03:00
Ali Parlakci
b93b206a96 Typo fix 2018-07-12 12:31:32 +03:00
Ali Parlakci
b84684f786 Check if installed 2018-07-12 12:30:11 +03:00
Ali Parlakci
68558950ca Broken links fixed 2018-07-12 12:26:46 +03:00
aliparlakci
795965f754 Readme refactor (#35)
* Shorten the README.md file

* Added more information and guides

* Typo fix

* Rename sections
2018-07-12 12:25:09 +03:00
10 changed files with 224 additions and 83 deletions

5
.gitignore vendored

@@ -1,4 +1,5 @@
build/
dist/
MANIFEST
__pycache__/
src/__pycache__/
logs/
*.json


@@ -3,20 +3,29 @@ This program downloads imgur, gfycat and direct image and video links of saved p
**PLEASE** post any issue you have with the script to the [Issues](https://github.com/aliparlakci/bulk-downloader-for-reddit/issues) tab. Since I don't have any testers or contributors, I need your feedback.
## What can it do?
### It...
- can get posts from: frontpage, subreddits, multireddits, redditor's submissions, upvoted and saved posts; search results or just plain reddit links
- sorts posts by hot, top, new and so on
- downloads imgur albums, gfycat links, [self posts](#i-cant-open-the-self-posts) and any link to a direct image
- skips the existing ones
- puts post titles to file's name
- puts every post to its subreddit's folder
- saves a reusable copy of posts' details that are found so that they can be re-downloaded again
- logs failed ones in a file to so that you can try to download them later
- can be run with double-clicking on Windows (but I don't recommend it)
## What it can do
- Can get posts from: frontpage, subreddits, multireddits, redditor's submissions, upvoted and saved posts; search results or just plain reddit links
- Sorts posts by hot, top, new and so on
- Downloads imgur albums, gfycat links, [self posts](#how-do-i-open-self-post-files) and any link to a direct image
- Skips the existing ones
- Puts post titles in file names
- Puts every post in its subreddit's folder
- Saves a reusable copy of the details of the posts it finds so that they can be re-downloaded later
- Logs failed ones in a file so that you can try to download them later
- Can be run by double-clicking on Windows
## [Download the latest release](https://github.com/aliparlakci/bulk-downloader-for-reddit/releases/latest)
## How it works
- For **Windows** and **Linux** users, there are executable files that run without installing a third-party program. But if you are paranoid like me, you can **[compile it from source code](docs/COMPILE_FROM_SOURCE.md)**.
- On Windows, double-click the bulk-downloader-for-reddit file.
- On Linux, extract the files to a folder and open a terminal inside it. Type **`./bulk-downloader-for-reddit`**.
- **MacOS** users have to **[compile it from source code](docs/COMPILE_FROM_SOURCE.md)**.
The script also accepts **command-line arguments**. For further information, see **[`--help`](docs/COMMAND_LINE_ARGUMENTS.md)**.
## Setting up the script
Because this is not a commercial app, you need to create an imgur developer app in order for the API to work.
@@ -32,20 +41,32 @@ It should redirect to a page which shows your **imgur_client_id** and **imgur_cl
\* Select **OAuth 2 authorization without a callback URL** first then select **Anonymous usage without user authorization** if it says *Authorization callback URL: required*
## Running the script
For **Windows** users, there is an *EXE* file to run easily.
**Linux** and **MacOS** users have to install Python 3 and run it from the *source code* through terminal.
To get further information about that and **using command-line arguments to run the script**, see **[`python script.py --help`](docs/help_page.md)**
## FAQ
### I can't open the self post files.
### How do I open self post files?
- Self posts are stored on reddit as Markdown-styled text, so the script downloads them as they are in order to keep their styling.
However, there is a great Chrome extension [here](https://chrome.google.com/webstore/detail/markdown-viewer/ckkdlimhmcjmikdlpkmbgfkaikojcbjk) for viewing Markdown files with its styling. Install it and open the files with Chrome.
However, there is a [great Chrome extension](https://chrome.google.com/webstore/detail/markdown-viewer/ckkdlimhmcjmikdlpkmbgfkaikojcbjk) for viewing Markdown files with their styling. Install it and open the files with [Chrome](https://www.google.com/intl/tr/chrome/).
They are basically text files, though, so you can also view them with any text editor, such as Notepad on Windows, gedit on Linux or Text Editor on MacOS.
### How can I change my credentials?
- All of the user data is held in the **config.json** file, which is in a folder named "Bulk Downloader for Reddit" in your **Home** directory. You can edit your credentials there.
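A minimal sketch of how that location can be resolved in Python, assuming only the folder and file names given above. The helper names `config_path` and `load_config` are hypothetical and are not taken from the script.

```python
import json
from pathlib import Path

# Folder name as described in the FAQ answer above.
CONFIG_DIR_NAME = "Bulk Downloader for Reddit"


def config_path(home=None):
    """Return the expected location of config.json (hypothetical helper)."""
    home = Path.home() if home is None else Path(home)
    return home / CONFIG_DIR_NAME / "config.json"


def load_config(home=None):
    """Create the folder if needed and read config.json, or return {}."""
    path = config_path(home)
    path.parent.mkdir(parents=True, exist_ok=True)
    if not path.exists():
        return {}
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

Calling `config_path()` with no argument uses the current user's home directory, which is the file to edit when changing credentials.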
## Changelog
### [19/07/2018](https://github.com/aliparlakci/bulk-downloader-for-reddit/tree/50c4a8d6d7e21d9b44a6d6d00c1811cfe9c655b1)
- Added v.redd.it support
- Added custom exception descriptions to FAILED.json file
- Fixed the bug that prevents downloading some gfycat URLs
### [13/07/2018](https://github.com/aliparlakci/bulk-downloader-for-reddit/tree/9f831e1b784a770c82252e909462871401a05c11)
- Change config.json file's path to home directory
### [12/07/2018](https://github.com/aliparlakci/bulk-downloader-for-reddit/tree/50a77f6ba54c24f5647d5ea4e177400b71ff04a7)
- Added binaries for Windows and Linux
- Wait on KeyboardInterrupt
- Accept multiple subreddit input
- Fixed the bug that prevents choosing "[0] exit" by typing "exit"
### [11/07/2018](https://github.com/aliparlakci/bulk-downloader-for-reddit/tree/a28a7776ab826dea2a8d93873a94cd46db3a339b)
- Improvements on UX and UI
- Added logging errors to CONSOLE_LOG.txt


@@ -1,5 +0,0 @@
theme: jekyll-theme-minimal
show_downloads: false
#title: Bulk Downloader for Reddit
description: Code written by Ali PARLAKCI
google_analytics: UA-80780721-3


@@ -1,6 +1,10 @@
## python script.py --help
# Using command-line arguments
See **[compiling from source](COMPILE_FROM_SOURCE.md)** page first unless you are using an executable file. If you are using an executable file, see [using terminal](COMPILE_FROM_SOURCE.md#using-terminal) and come back.
***Use*** `.\bulk-downloader-for-reddit.exe` ***or*** `./bulk-downloader-for-reddit` ***if you are using the executable***.
```console
$ python script.py --help
usage: script.py [-h] [--directory DIRECTORY] [--link link] [--saved]
[--submitted] [--upvoted] [--log LOG FILE]
[--subreddit SUBREDDIT [SUBREDDIT ...]]
@@ -37,7 +41,7 @@ optional arguments:
downloading later
```
## Examples
# Examples
- **Use `python3` instead of `python` if you are using *MacOS* or *Linux***
@@ -46,40 +50,49 @@ python script.py
```
```console
python script.py .\\NEW_FOLDER --sort new --time all --limit 10 --link "https://www.reddit.com/r/gifs/search?q=dogs&restrict_sr=on&type=link&sort=new&t=month"
.\bulk-downloader-for-reddit.exe
```
```console
python script.py .\\NEW_FOLDER --link "https://www.reddit.com/r/learnprogramming/comments/7mjw12/"
python script.py
```
```console
python script.py .\\NEW_FOLDER --search cats --sort new --time all --subreddit gifs pics --NoDownload
.\bulk-downloader-for-reddit.exe --directory .\\NEW_FOLDER --search cats --sort new --time all --subreddit gifs pics --NoDownload
```
```console
python script.py .\\NEW_FOLDER --user [USER_NAME] --submitted --limit 10
./bulk-downloader-for-reddit --directory .\\NEW_FOLDER\\ANOTHER_FOLDER --saved --limit 1000
```
```console
python script.py .\\NEW_FOLDER --multireddit good_subs --user [USER_NAME] --sort top --time week --limit 250
python script.py --directory .\\NEW_FOLDER --sort new --time all --limit 10 --link "https://www.reddit.com/r/gifs/search?q=dogs&restrict_sr=on&type=link&sort=new&t=month"
```
```console
python script.py .\\NEW_FOLDER\\ANOTHER_FOLDER --saved --limit 1000
python script.py --directory .\\NEW_FOLDER --link "https://www.reddit.com/r/learnprogramming/comments/7mjw12/"
```
```console
python script.py C:\\NEW_FOLDER\\ANOTHER_FOLDER --log UNNAMED_FOLDER\\FAILED.json
python script.py --directory .\\NEW_FOLDER --search cats --sort new --time all --subreddit gifs pics --NoDownload
```
## FAQ
### I can't startup the script no matter what.
- Try these:
- **`python`**
- **`python3`**
- **`python3.7`**
- **`python3.6`**
- **`py -3`**
```console
python script.py --directory .\\NEW_FOLDER --user [USER_NAME] --submitted --limit 10
```
Python have real issues about naming their program
```console
python script.py --directory .\\NEW_FOLDER --multireddit good_subs --user [USER_NAME] --sort top --time week --limit 250
```
```console
python script.py --directory .\\NEW_FOLDER\\ANOTHER_FOLDER --saved --limit 1000
```
```console
python script.py --directory C:\\NEW_FOLDER\\ANOTHER_FOLDER --log UNNAMED_FOLDER\\FAILED.json
```
# FAQ
## I can't start up the script no matter what.
See **[finding the correct keyword for Python](COMPILE_FROM_SOURCE.md#finding-the-correct-keyword-for-python)**


@@ -0,0 +1,42 @@
# Compiling from source code
## Requirements
### Python 3 Interpreter
The latest* version of **Python 3** is needed. Check whether it is already installed as described [here](#finding-the-correct-keyword-for-python). If not, download the matching release for your platform [here](https://www.python.org/downloads/) and install it. If you are a *Windows* user, selecting the **Add Python 3 to PATH** option is mandatory.
\* *Use Python 3.6.5 if you encounter an issue*
## Using terminal
### To open it...
- **On Windows 8/8.1/10**: Press the File tab on **Windows Explorer**, click on **Open Windows PowerShell** or **Open Windows Command Prompt** or look for *Command Prompt* or *PowerShell* in *Start Menu*.
- **On Windows 7**: Press **WindowsKey+R**, type **cmd** and hit Enter or look for *Command Prompt* or *PowerShell* in *Start Menu*.
- **On Linux**: Right-click in a folder and select **Open Terminal** or press **Ctrl+Alt+T** or look for **Terminal** in the programs.
- **On MacOS**: Look for an app called **Terminal**.
### Navigating to the directory where script is downloaded
Go inside the folder where script.py is located. If you are not familiar with changing directories in a command prompt or terminal, read *Changing Directories* in [this article](https://lifehacker.com/5633909/who-needs-a-mouse-learn-to-use-the-command-line-for-almost-anything).
## Finding the correct keyword for Python
Enter these commands in a terminal window one at a time until one of them prints the version you have downloaded and installed:
- `python --version`
- `python3 --version`
- `python3.7 --version`
- `python3.6 --version`
- `py --version`
- `py -3 --version`
- `py -3.6 --version`
- `py -3.7 --version`
Once one does, your keyword is that command without the `--version` part.
## Installing dependencies
When you are in the directory where script.py is, enter the line below in the terminal window, using your keyword for Python:
```console
python -m pip install -r requirements.txt
```
---
Now, you can go to [Using command-line arguments](COMMAND_LINE_ARGUMENTS.md)


@@ -22,7 +22,7 @@ from src.tools import (GLOBAL, createLogFile, jsonFile, nameCorrector,
__author__ = "Ali Parlakci"
__license__ = "GPL"
__version__ = "1.1.0"
__version__ = "1.2.0"
__maintainer__ = "Ali Parlakci"
__email__ = "parlakciali@gmail.com"
@@ -194,12 +194,12 @@ class PromptUser:
))
print(" "*4+"[0] exit\n")
choice = input("> ")
while not choice.lower() in choices+choicesByIndex:
while not choice.lower() in choices+choicesByIndex+["exit"]:
print("Invalid input\n")
programModeIndex = input("> ")
if choice == "0":
quit()
if choice == "0" or choice == "exit":
sys.exit()
elif choice in choicesByIndex:
return choices[int(choice)-1]
else:
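In the loop as shown in this hunk, the retry is read into `programModeIndex` while the condition tests `choice`, so an invalid first answer would appear to be rejected again no matter what is typed next. Below is a minimal sketch of a prompt that re-reads into the tested variable. `choose_mode` and its `read` parameter are hypothetical names, and the index scheme (1-based, with "0" reserved for exit) is inferred from the `[0] exit` line rather than confirmed by the source.

```python
import sys


def choose_mode(choices, read=input):
    """Ask until the answer is a mode name, its 1-based index, "0" or "exit"."""
    by_index = [str(i) for i in range(1, len(choices) + 1)]
    valid = set(choices) | set(by_index) | {"0", "exit"}

    choice = read("> ").strip().lower()
    while choice not in valid:
        print("Invalid input\n")
        # Re-read into the same variable that the loop condition tests.
        choice = read("> ").strip().lower()

    if choice in ("0", "exit"):
        sys.exit()
    if choice in by_index:
        return choices[int(choice) - 1]
    return choice
```

Passing `read` as a parameter is only there so the prompt can be exercised without a terminal; by default it is the built-in `input`.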
@@ -232,10 +232,20 @@ class PromptUser:
GLOBAL.arguments.time = timeFilter
if programMode == "subreddit":
GLOBAL.arguments.subreddit = input("\nsubreddit: ")
subredditInput = input("subreddit: ")
GLOBAL.arguments.subreddit = subredditInput
while not subredditInput == "":
subredditInput = input("subreddit: ")
GLOBAL.arguments.subreddit += "+" + subredditInput
if " " in GLOBAL.arguments.subreddit:
GLOBAL.arguments.subreddit = "+".join(GLOBAL.arguments.subreddit.split())
# DELETE THE PLUS (+) AT THE END
GLOBAL.arguments.subreddit = GLOBAL.arguments.subreddit[:-1]
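The same input handling can also be written by collecting names in a list and joining once, which avoids trimming a trailing plus sign afterwards. This is a sketch of an alternative, not the script's code: `read_subreddits` and its `read` parameter are hypothetical names.

```python
def read_subreddits(read=input):
    """Keep asking for subreddits until an empty line, then join with '+'."""
    names = []
    while True:
        entry = read("subreddit: ").strip()
        if entry == "":
            break
        # A single answer may hold several space-separated names.
        names.extend(entry.split())
    return "+".join(names)
```

For the answers `gifs pics`, `aww` and an empty line, this returns `gifs+pics+aww`.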
print("\nselect sort type:")
sortTypes = [
"hot","top","new","rising","controversial"
@@ -389,7 +399,7 @@ def postFromLog(fileName):
content = jsonFile(fileName).read()
else:
print("File not found")
quit()
sys.exit()
try:
del content["HEADER"]
@@ -497,7 +507,7 @@ def download(submissions):
"Imgur login failed. Quitting the program "\
"as unexpected errors might occur."
)
quit()
sys.exit()
except Exception as exception:
print(exception)
@@ -528,7 +538,7 @@ def download(submissions):
downloadedCount -= 1
except NotADownloadableLinkError as exception:
print("Could not read the page source")
print(exception)
FAILED_FILE.add({int(i+1):[str(exception),submissions[i]]})
downloadedCount -= 1
@@ -596,46 +606,47 @@ def main():
PromptUser()
except Exception as err:
print(err)
quit()
GLOBAL.config = getConfig("config.json")
sys.exit()
if not Path(GLOBAL.configDirectory).is_dir():
os.makedirs(GLOBAL.configDirectory)
GLOBAL.config = getConfig(GLOBAL.configDirectory / "config.json")
if GLOBAL.arguments.log is not None:
logDir = Path(GLOBAL.arguments.log)
download(postFromLog(logDir))
quit()
sys.exit()
try:
POSTS = getPosts(prepareAttributes())
except InsufficientPermission:
print("You do not have permission to do that")
quit()
sys.exit()
except NoMatchingSubmissionFound:
print("No matching submission was found")
quit()
sys.exit()
except NoRedditSupoort:
print("Reddit does not support that")
quit()
sys.exit()
except NoPrawSupport:
print("PRAW does not support that")
quit()
sys.exit()
except MultiredditNotFound:
print("Multireddit not found")
quit()
sys.exit()
except InvalidSortingType:
print("Invalid sorting type has given")
quit()
sys.exit()
except InvalidRedditLink:
print("Invalid reddit link")
quit()
sys.exit()
if POSTS is None:
print("I could not find any posts in that URL")
quit()
sys.exit()
if GLOBAL.arguments.NoDownload:
quit()
sys.exit()
else:
download(POSTS)
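The repeated `except ...: print(...); sys.exit()` pairs above could be driven from one table. This is a sketch under assumptions: the two exception classes are stand-ins for the script's own, and `fetch_or_exit` is a hypothetical helper. It keeps the behavior visible in the hunk (print the message, then call `sys.exit()` with no argument).

```python
import sys


class NoMatchingSubmissionFound(Exception):
    """Stand-in for the script's exception of the same name."""


class InvalidRedditLink(Exception):
    """Stand-in for the script's exception of the same name."""


# Messages copied from the print() calls in the hunk above.
MESSAGES = {
    NoMatchingSubmissionFound: "No matching submission was found",
    InvalidRedditLink: "Invalid reddit link",
}


def fetch_or_exit(fetch):
    """Call fetch(); on a known error, print its message and exit."""
    try:
        return fetch()
    except tuple(MESSAGES) as error:
        # Lookup is by exact class; subclasses would need their own entry.
        print(MESSAGES[type(error)])
        sys.exit()
```

`sys.exit()` raises `SystemExit`, so it also works where the interactive `quit()` helper is not available.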
@@ -654,7 +665,6 @@ if __name__ == "__main__":
if GLOBAL.directory is None:
GLOBAL.directory = Path(".\\")
print("\nQUITTING...")
quit()
except Exception as exception:
logging.error("Runtime error!", exc_info=full_exc_info(sys.exc_info()))
print(log_stream.getvalue())

50
setup.py Normal file

@@ -0,0 +1,50 @@
#!C:\Users\Ali\AppData\Local\Programs\Python\Python36\python.exe
## python setup.py build
import sys
from cx_Freeze import setup, Executable
from script import __version__

options = {
    "build_exe": {
        "packages":[
            "idna","imgurpython", "praw", "requests"
        ]
    }
}

if sys.platform == "win32":
    executables = [Executable(
        "script.py",
        targetName="bulk-downloader-for-reddit.exe",
        shortcutName="Bulk Downloader for Reddit",
        shortcutDir="DesktopFolder"
    )]
elif sys.platform == "linux":
    executables = [Executable(
        "script.py",
        targetName="bulk-downloader-for-reddit",
        shortcutName="Bulk Downloader for Reddit",
        shortcutDir="DesktopFolder"
    )]

setup(
    name = "Bulk Downloader for Reddit",
    version = __version__,
    description = "Bulk Downloader for Reddit",
    author = "Ali Parlakci",
    author_email="parlakciali@gmail.com",
    url="https://github.com/aliparlakci/bulk-downloader-for-reddit",
    classifiers=(
        "Programming Language :: Python :: 3",
        "License :: OSI Approved :: GNU General Public License v3 (GPLv3)",
        "Natural Language :: English",
        "Environment :: Console",
        "Operating System :: OS Independent",
    ),
    executables = executables,
    options = options
)


@@ -36,7 +36,10 @@ def getExtension(link):
if TYPE in parsed:
return "."+parsed[-1]
else:
return '.jpg'
if not "v.redd.it" in link:
return '.jpg'
else:
return '.mp4'
def getFile(fileDir,tempDir,imageURL,indent=0):
"""Downloads given file to given directory.
@@ -169,7 +172,9 @@ class Imgur:
if duplicates == imagesLenght:
raise FileAlreadyExistsError
elif howManyDownloaded < imagesLenght:
raise AlbumNotDownloadedCompletely
raise AlbumNotDownloadedCompletely(
"Album Not Downloaded Completely"
)
@staticmethod
def initImgur():
@@ -217,9 +222,9 @@ class Gfycat:
try:
POST['mediaURL'] = self.getLink(POST['postURL'])
except IndexError:
raise NotADownloadableLinkError
raise NotADownloadableLinkError("Could not read the page source")
except Exception as exception:
raise NotADownloadableLinkError
raise NotADownloadableLinkError("Could not read the page source")
POST['postExt'] = getExtension(POST['mediaURL'])
@@ -248,8 +253,7 @@ class Gfycat:
if url[-1:] == '/':
url = url[:-1]
if 'gifs' in url:
url = "https://gfycat.com/" + url.split('/')[-1]
url = "https://gfycat.com/" + url.split('/')[-1]
pageSource = (urllib.request.urlopen(url).read().decode().split('\n'))
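Read on its own, this hunk appears to apply the rewrite to every gfycat link instead of only those containing `gifs`. A minimal sketch of that normalization step in isolation; `canonical_gfycat_url` is a hypothetical name, and the sketch covers only the two steps visible here (drop a trailing slash, keep the last path segment).

```python
def canonical_gfycat_url(url):
    """Reduce a gfycat link to https://gfycat.com/<last path segment>."""
    if url[-1:] == "/":
        url = url[:-1]
    return "https://gfycat.com/" + url.split("/")[-1]
```

For example, `https://gfycat.com/gifs/detail/SomeName/` becomes `https://gfycat.com/SomeName`.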
@@ -266,7 +270,7 @@ class Gfycat:
break
if "".join(link) == "":
raise NotADownloadableLinkError
raise NotADownloadableLinkError("Could not read the page source")
return "".join(link)
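Passing a message when raising matters because the failure log stores `str(exception)`: an exception raised without arguments stringifies to an empty string. A small sketch of that effect, where the exception class is a stand-in for the script's own and `failure_entry` is a hypothetical helper that mirrors the shape of the `FAILED_FILE.add(...)` call.

```python
class NotADownloadableLinkError(Exception):
    """Stand-in for the script's exception of the same name."""


def failure_entry(index, exception, submission):
    """Build one record: 1-based position -> [description, submission]."""
    return {index + 1: [str(exception), submission]}


post = {"postURL": "https://example.com"}

# Raised without a message: the stored description is empty.
bare = failure_entry(0, NotADownloadableLinkError(), post)

# Raised with a message: the description is kept in the record.
described = failure_entry(
    0, NotADownloadableLinkError("Could not read the page source"), post
)
```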


@@ -89,7 +89,7 @@ def beginPraw(config,user_agent = str(socket.gethostname())):
authorizedInstance = GetAuth(reddit,port).getRefreshToken(*scopes)
reddit = authorizedInstance[0]
refresh_token = authorizedInstance[1]
jsonFile("config.json").add({
jsonFile(GLOBAL.configDirectory / "config.json").add({
"reddit_refresh_token":refresh_token
})
else:
@@ -98,7 +98,7 @@ def beginPraw(config,user_agent = str(socket.gethostname())):
authorizedInstance = GetAuth(reddit,port).getRefreshToken(*scopes)
reddit = authorizedInstance[0]
refresh_token = authorizedInstance[1]
jsonFile("config.json").add({
jsonFile(GLOBAL.configDirectory / "config.json").add({
"reddit_refresh_token":refresh_token
})
return reddit
@@ -397,8 +397,9 @@ def checkIfMatching(submission):
imgurCount += 1
return details
elif isDirectLink(submission.url):
elif isDirectLink(submission.url) is not None:
details['postType'] = 'direct'
details['postURL'] = isDirectLink(submission.url)
directCount += 1
return details
@@ -435,7 +436,7 @@ def printSubmission(SUB,validNumber,totalNumber):
def isDirectLink(URL):
"""Check if link is a direct image link.
If so, return True,
If so, return URL,
if not, return False
"""
@@ -444,10 +445,13 @@ def isDirectLink(URL):
URL = URL[:-1]
if "i.reddituploads.com" in URL:
return True
return URL
elif "v.redd.it" in URL:
return URL+"/DASH_600_K"
for extension in imageTypes:
if extension in URL:
return True
return URL
else:
return False


@@ -14,6 +14,7 @@ class GLOBAL:
config = None
arguments = None
directory = None
configDirectory = Path.home() / "Bulk Downloader for Reddit"
reddit_client_id = "BSyphDdxYZAgVQ"
reddit_client_secret = "bfqNJaRh8NMh-9eAr-t4TRz-Blk"
printVanilla = print