mirror of
https://github.com/KevinMidboe/bulk-downloader-for-reddit.git
synced 2026-01-10 03:05:36 +00:00
Compare commits
23 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | d6194c57d9 | | |
| | cd87a4a120 | | |
| | 08cddf4c83 | | |
| | 88fa9e742d | | |
| | 1c17f174a8 | | |
| | 9b36336ac3 | | |
| | 35e551f20c | | |
| | 0f2bda9c34 | | |
| | 8ab694bcc1 | | |
| | 898f59d035 | | |
| | 6b6db37185 | | |
| | d4a5100128 | | |
| | 22047338e2 | | |
| | b16cdd3cbb | | |
| | 2a8394a48c | | |
| | eac4404bbf | | |
| | fae49d50da | | |
| | 7130525ece | | |
| | 2bf1e03ee1 | | |
| | 15a91e5784 | | |
| | 344201a70d | | |
| | 92e47adb43 | | |
| | 4d385fda60 | | |
1
.gitignore
vendored
@@ -4,3 +4,4 @@ MANIFEST
|
|||||||
__pycache__/
|
__pycache__/
|
||||||
src/__pycache__/
|
src/__pycache__/
|
||||||
config.json
|
config.json
|
||||||
|
env/
|
||||||
|
|||||||
113
README.md
@@ -1,9 +1,11 @@
|
|||||||
# Bulk Downloader for Reddit
|
# Bulk Downloader for Reddit
|
||||||
Downloads media from reddit posts.
|
|
||||||
|
Downloads media from reddit posts. Made by [u/aliparlakci](https://reddit.com/u/aliparlakci)
|
||||||
|
|
||||||
## [Download the latest release here](https://github.com/aliparlakci/bulk-downloader-for-reddit/releases/latest)
|
## [Download the latest release here](https://github.com/aliparlakci/bulk-downloader-for-reddit/releases/latest)
|
||||||
|
|
||||||
## What it can do
|
## What it can do
|
||||||
|
|
||||||
- Can get posts from: frontpage, subreddits, multireddits, redditor's submissions, upvoted and saved posts; search results or just plain reddit links
|
- Can get posts from: frontpage, subreddits, multireddits, redditor's submissions, upvoted and saved posts; search results or just plain reddit links
|
||||||
- Sorts posts by hot, top, new and so on
|
- Sorts posts by hot, top, new and so on
|
||||||
- Downloads **REDDIT** images and videos, **IMGUR** images and albums, **GFYCAT** links, **EROME** images and albums, **SELF POSTS** and any link to a **DIRECT IMAGE**
|
- Downloads **REDDIT** images and videos, **IMGUR** images and albums, **GFYCAT** links, **EROME** images and albums, **SELF POSTS** and any link to a **DIRECT IMAGE**
|
||||||
@@ -13,21 +15,112 @@ Downloads media from reddit posts.
|
|||||||
- Saves a reusable copy of posts' details that are found so that they can be re-downloaded again
|
- Saves a reusable copy of posts' details that are found so that they can be re-downloaded again
|
||||||
- Logs failed ones in a file so that you can try to download them later
|
- Logs failed ones in a file so that you can try to download them later
|
||||||
|
|
||||||
## **Compiling it from source code**
|
## Installation
|
||||||
MacOS users have to use this option. See *[here](docs/COMPILE_FROM_SOURCE.md)*
|
|
||||||
|
|
||||||
## Additional options
|
You can use it either as a `bulk-downloader-for-reddit.exe` executable file for Windows, as a Linux binary or as a *[Python script](#python-script)*. There is no MacOS executable, MacOS users must use the Python script option.
|
||||||
Script also accepts additional options via command-line arguments. Get further information from **[`--help`](docs/COMMAND_LINE_ARGUMENTS.md)**
|
|
||||||
|
### Executables
|
||||||
|
|
||||||
|
For Windows and Linux, [download the latest executables, here](https://github.com/aliparlakci/bulk-downloader-for-reddit/releases/latest).
|
||||||
|
|
||||||
|
### Python script
|
||||||
|
|
||||||
|
* Download this repository ([latest zip](https://github.com/aliparlakci/bulk-downloader-for-reddit/archive/master.zip) or `git clone git@github.com:aliparlakci/bulk-downloader-for-reddit.git`).
|
||||||
|
* Enter its folder.
|
||||||
|
* Run `python ./script.py` from the command-line (Windows, MacOSX or Linux command line; it may work with Anaconda prompt) See [here](docs/INTERPRET_FROM_SOURCE.md#finding-the-correct-keyword-for-python) if you have any trouble with this step.
|
||||||
|
|
||||||
|
It uses Python 3.6 and above. It won't work with Python 3.5 or any Python 2.x. If you have trouble setting it up, see [here](docs/INTERPRET_FROM_SOURCE.md).
|
||||||
|
|
||||||
|
|
||||||
|
### Setting up the script
|
||||||
|
|
||||||
## Setting up the script
|
|
||||||
You need to create an imgur developer app in order for the API to work. Go to https://api.imgur.com/oauth2/addclient and fill in the form (it does not really matter how you fill it in).
|
You need to create an imgur developer app in order for the API to work. Go to https://api.imgur.com/oauth2/addclient and fill in the form (it does not really matter how you fill it in).
|
||||||
|
|
||||||
It should redirect you to a page where it shows your **imgur_client_id** and **imgur_client_secret**.
|
It should redirect you to a page where it shows your **imgur_client_id** and **imgur_client_secret**.
|
||||||
|
|
||||||
|
When you run it for the first time, it will automatically create `config.json` file containing `imgur_client_id`, `imgur_client_secret`, `reddit_username` and `reddit_refresh_token`.
|
||||||
|
|
||||||
|
|
||||||
|
## Running
|
||||||
|
|
||||||
|
You can run it in interactive mode, or using [command-line arguments](docs/COMMAND_LINE_ARGUMENTS.md) (also available via `python ./script.py --help` or `bulk-downloader-for-reddit.exe --help`).
|
||||||
|
|
||||||
|
To run the interactive mode, simply use `python ./script.py` or double click on `bulk-downloader-for-reddit.exe` without any extra commands.
|
||||||
|
|
||||||
|
### [Example for command line arguments](docs/COMMAND_LINE_ARGUMENTS.md#examples)
|
||||||
|
|
||||||
|
### Example for an interactive script
|
||||||
|
|
||||||
|
```
|
||||||
|
(py37) bulk-downloader-for-reddit user$ python ./script.py
|
||||||
|
|
||||||
|
Bulk Downloader for Reddit v1.6.5
|
||||||
|
Written by Ali PARLAKCI – parlakciali@gmail.com
|
||||||
|
|
||||||
|
https://github.com/aliparlakci/bulk-downloader-for-reddit/
|
||||||
|
|
||||||
|
download directory: downloads/dataisbeautiful_last_few
|
||||||
|
select program mode:
|
||||||
|
|
||||||
|
[1] search
|
||||||
|
[2] subreddit
|
||||||
|
[3] multireddit
|
||||||
|
[4] submitted
|
||||||
|
[5] upvoted
|
||||||
|
[6] saved
|
||||||
|
[7] log
|
||||||
|
[0] exit
|
||||||
|
|
||||||
|
> 2
|
||||||
|
(type frontpage for all subscribed subreddits,
|
||||||
|
use plus to seperate multi subreddits: pics+funny+me_irl etc.)
|
||||||
|
|
||||||
|
subreddit: dataisbeautiful
|
||||||
|
|
||||||
|
select sort type:
|
||||||
|
|
||||||
|
[1] hot
|
||||||
|
[2] top
|
||||||
|
[3] new
|
||||||
|
[4] rising
|
||||||
|
[5] controversial
|
||||||
|
[0] exit
|
||||||
|
|
||||||
|
> 1
|
||||||
|
|
||||||
|
limit (0 for none): 50
|
||||||
|
|
||||||
|
GETTING POSTS
|
||||||
|
|
||||||
|
|
||||||
|
(1/24) – r/dataisbeautiful
|
||||||
|
AutoModerator_[Battle]_DataViz_Battle_for_the_month_of_April_2019__Visualize_the_April_Fool's_Prank_for_2019-04-01_on__r_DataIsBeautiful_b8ws37.md
|
||||||
|
Downloaded
|
||||||
|
|
||||||
|
(2/24) – r/dataisbeautiful
|
||||||
|
AutoModerator_[Topic][Open]_Open_Discussion_Monday_—_Anybody_can_post_a_general_visualization_question_or_start_a_fresh_discussion!_bg1wej.md
|
||||||
|
Downloaded
|
||||||
|
|
||||||
|
...
|
||||||
|
|
||||||
|
Total of 24 links downloaded!
|
||||||
|
|
||||||
|
Press enter to quit
|
||||||
|
```
|
||||||
|
|
||||||
|
|
||||||
## FAQ
|
## FAQ
|
||||||
|
|
||||||
|
### I am running the script on a headless machine or on a remote server. How can I authenticate my reddit account?
|
||||||
|
- Download the script on your everyday computer and run it once.
|
||||||
|
- Authenticate the program on both reddit and imgur.
|
||||||
|
- Go to your Home folder (for Windows users it is `C:\Users\[USERNAME]\`, for Linux users it is `/home/[USERNAME]`)
|
||||||
|
- Copy the *config.json* file inside the Bulk Downloader for Reddit folder and paste it **next to** the file that you run the program.
|
||||||
|
|
||||||
### How can I change my credentials?
|
### How can I change my credentials?
|
||||||
- All of the user data is held in **config.json** file which is in a folder named "Bulk Downloader for Reddit" in your **Home** directory. You can edit
|
- All of the user data is held in **config.json** file which is in a folder named "Bulk Downloader for Reddit" in your **Home** directory. You can edit them, there.
|
||||||
them, there.
|
|
||||||
|
Also if you already have a config.json file, you can paste it **next to** the script and override the one on your Home directory.
|
||||||
|
|
||||||
### What do the dots resemble when getting posts?
|
### What do the dots resemble when getting posts?
|
||||||
- Each dot means that 100 posts are scanned.
|
- Each dot means that 100 posts are scanned.
|
||||||
@@ -50,4 +143,6 @@ It should redirect you to a page where it shows your **imgur_client_id** and **i
|
|||||||
|
|
||||||
However, they are basically text files. You can also view them with any text editor such as Notepad on Windows, gedit on Linux or Text Editor on MacOS
|
However, they are basically text files. You can also view them with any text editor such as Notepad on Windows, gedit on Linux or Text Editor on MacOS
|
||||||
|
|
||||||
## [See the changes on *master* here](docs/CHANGELOG.md)
|
## Changelog
|
||||||
|
|
||||||
|
* [See the changes on *master* here](docs/CHANGELOG.md)
|
||||||
|
|||||||
@@ -1,4 +1,7 @@
|
|||||||
# Changes on *master*
|
# Changes on *master*
|
||||||
|
## [23/02/2019](https://github.com/aliparlakci/bulk-downloader-for-reddit/tree/4d385fda60028343be816eb7c4f7bc613a9d555d)
|
||||||
|
- Fixed v.redd.it links
|
||||||
|
|
||||||
## [27/01/2019](https://github.com/aliparlakci/bulk-downloader-for-reddit/tree/b7baf07fb5998368d87e3c4c36aed40daf820609)
|
## [27/01/2019](https://github.com/aliparlakci/bulk-downloader-for-reddit/tree/b7baf07fb5998368d87e3c4c36aed40daf820609)
|
||||||
- Clarified the instructions
|
- Clarified the instructions
|
||||||
|
|
||||||
|
|||||||
@@ -1,6 +1,6 @@
|
|||||||
# Using command-line arguments
|
# Using command-line arguments
|
||||||
|
|
||||||
See **[compiling from source](COMPILE_FROM_SOURCE.md)** page first unless you are using an executable file. If you are using an executable file, see [using terminal](COMPILE_FROM_SOURCE.md#using-terminal) and come back.
|
See **[compiling from source](INTERPRET_FROM_SOURCE.md)** page first unless you are using an executable file. If you are using an executable file, see [using terminal](INTERPRET_FROM_SOURCE.md#using-terminal) and come back.
|
||||||
|
|
||||||
***Use*** `.\bulk-downloader-for-reddit.exe` ***or*** `./bulk-downloader-for-reddit` ***if you are using the executable***.
|
***Use*** `.\bulk-downloader-for-reddit.exe` ***or*** `./bulk-downloader-for-reddit` ***if you are using the executable***.
|
||||||
```console
|
```console
|
||||||
@@ -98,4 +98,4 @@ python script.py --directory C:\\NEW_FOLDER\\ANOTHER_FOLDER --log UNNAMED_FOLDER
|
|||||||
|
|
||||||
# FAQ
|
# FAQ
|
||||||
## I can't startup the script no matter what.
|
## I can't startup the script no matter what.
|
||||||
See **[finding the correct keyword for Python](COMPILE_FROM_SOURCE.md#finding-the-correct-keyword-for-python)**
|
See **[finding the correct keyword for Python](INTERPRET_FROM_SOURCE.md#finding-the-correct-keyword-for-python)**
|
||||||
|
|||||||
@@ -1,4 +1,4 @@
|
|||||||
# Compiling from source code
|
# Interpret from source code
|
||||||
## Requirements
|
## Requirements
|
||||||
### Python 3 Interpreter
|
### Python 3 Interpreter
|
||||||
- This program is designed to work best on **Python 3.6.5** and this version of Python 3 is suggested. See if it is already installed, [here](#finding-the-correct-keyword-for-python).
|
- This program is designed to work best on **Python 3.6.5** and this version of Python 3 is suggested. See if it is already installed, [here](#finding-the-correct-keyword-for-python).
|
||||||
@@ -6,11 +6,11 @@
|
|||||||
|
|
||||||
## Using terminal
|
## Using terminal
|
||||||
### To open it...
|
### To open it...
|
||||||
- **On Windows**: Press **Shift+Right Click**, select **Open Powershell window here** or **Open Command Prompt window here**
|
- **on Windows**: Press **Shift+Right Click**, select **Open Powershell window here** or **Open Command Prompt window here**
|
||||||
|
|
||||||
- **On Linux**: Right-click in a folder and select **Open Terminal** or press **Ctrl+Alt+T**.
|
- **on Linux**: Right-click in a folder and select **Open Terminal** or press **Ctrl+Alt+T**.
|
||||||
|
|
||||||
- **On MacOS**: Look for an app called **Terminal**.
|
- **on MacOS**: Look for an app called **Terminal**.
|
||||||
|
|
||||||
### Navigating to the directory where script is downloaded
|
### Navigating to the directory where script is downloaded
|
||||||
Go inside the folder where script.py is located. If you are not familiar with changing directories on command-prompt and terminal read *Changing Directories* in [this article](https://lifehacker.com/5633909/who-needs-a-mouse-learn-to-use-the-command-line-for-almost-anything)
|
Go inside the folder where script.py is located. If you are not familiar with changing directories on command-prompt and terminal read *Changing Directories* in [this article](https://lifehacker.com/5633909/who-needs-a-mouse-learn-to-use-the-command-line-for-almost-anything)
|
||||||
15
script.py
@@ -23,7 +23,7 @@ from src.tools import (GLOBAL, createLogFile, jsonFile, nameCorrector,
|
|||||||
|
|
||||||
__author__ = "Ali Parlakci"
|
__author__ = "Ali Parlakci"
|
||||||
__license__ = "GPL"
|
__license__ = "GPL"
|
||||||
__version__ = "1.6.4.1"
|
__version__ = "1.6.5"
|
||||||
__maintainer__ = "Ali Parlakci"
|
__maintainer__ = "Ali Parlakci"
|
||||||
__email__ = "parlakciali@gmail.com"
|
__email__ = "parlakciali@gmail.com"
|
||||||
|
|
||||||
@@ -672,10 +672,15 @@ def main():
|
|||||||
except ProgramModeError as err:
|
except ProgramModeError as err:
|
||||||
PromptUser()
|
PromptUser()
|
||||||
|
|
||||||
if not Path(GLOBAL.configDirectory).is_dir():
|
if not Path(GLOBAL.defaultConfigDirectory).is_dir():
|
||||||
os.makedirs(GLOBAL.configDirectory)
|
os.makedirs(GLOBAL.defaultConfigDirectory)
|
||||||
GLOBAL.config = getConfig("config.json") if Path("config.json").exists() \
|
|
||||||
else getConfig(GLOBAL.configDirectory / "config.json")
|
if Path("config.json").exists():
|
||||||
|
GLOBAL.configDirectory = Path("config.json")
|
||||||
|
else:
|
||||||
|
GLOBAL.configDirectory = GLOBAL.defaultConfigDirectory / "config.json"
|
||||||
|
|
||||||
|
GLOBAL.config = getConfig(GLOBAL.configDirectory)
|
||||||
|
|
||||||
if GLOBAL.arguments.log is not None:
|
if GLOBAL.arguments.log is not None:
|
||||||
logDir = Path(GLOBAL.arguments.log)
|
logDir = Path(GLOBAL.arguments.log)
|
||||||
|
|||||||
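Review note: the `main()` hunk above changes how `config.json` is located. A copy next to the script now wins, otherwise the one in the default folder under the home directory is used, and the resolved file path is stored in `GLOBAL.configDirectory`. The lookup can be sketched in isolation as below. This is a minimal sketch, not the code from the commit: the function name `resolve_config_path` and its parameters are hypothetical.

```python
from pathlib import Path


def resolve_config_path(working_dir: Path, default_dir: Path) -> Path:
    """Pick the config.json to use (hypothetical helper, not in the repo).

    A config.json inside working_dir takes precedence. Otherwise the file
    inside default_dir is used, and default_dir is created if it is missing.
    """
    local_config = working_dir / "config.json"
    if local_config.exists():
        return local_config

    # Fall back to the per-user folder, creating it on first run.
    default_dir.mkdir(parents=True, exist_ok=True)
    return default_dir / "config.json"
```

A call such as `resolve_config_path(Path.cwd(), Path.home() / "Bulk Downloader for Reddit")` would mirror the two locations named in the README. Returning one full file path means later writes can target the same file that was read, which appears to be what the `jsonFile(GLOBAL.configDirectory)` changes rely on.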
@@ -117,7 +117,9 @@ class Erome:
|
|||||||
post["postSubmitter"]+"_"+title+"_"+post['postId']+".tmp"
|
post["postSubmitter"]+"_"+title+"_"+post['postId']+".tmp"
|
||||||
)
|
)
|
||||||
|
|
||||||
imageURL = "https:" + IMAGES[0]
|
imageURL = IMAGES[0]
|
||||||
|
if 'https://' not in imageURL and 'http://' not in imageURL:
|
||||||
|
imageURL = "https://" + imageURL
|
||||||
|
|
||||||
try:
|
try:
|
||||||
getFile(fileDir,tempDir,imageURL)
|
getFile(fileDir,tempDir,imageURL)
|
||||||
@@ -146,7 +148,9 @@ class Erome:
|
|||||||
extension = getExtension(IMAGES[i])
|
extension = getExtension(IMAGES[i])
|
||||||
|
|
||||||
fileName = str(i+1)
|
fileName = str(i+1)
|
||||||
imageURL = "https:" + IMAGES[i]
|
imageURL = IMAGES[i]
|
||||||
|
if 'https://' not in imageURL and 'http://' not in imageURL:
|
||||||
|
imageURL = "https://" + imageURL
|
||||||
|
|
||||||
fileDir = folderDir / (fileName + extension)
|
fileDir = folderDir / (fileName + extension)
|
||||||
tempDir = folderDir / (fileName + ".tmp")
|
tempDir = folderDir / (fileName + ".tmp")
|
||||||
|
|||||||
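Review note: both Erome hunks above replace the unconditional `"https:" +` prefix with a check for an existing scheme. A slightly stricter variant is sketched below. It differs from the commit in two ways that are assumptions on my part: it tests the start of the string instead of using a substring check, and it treats protocol-relative values (`//host/path`) separately so that no extra slashes are added. The helper name `ensure_scheme` is hypothetical.

```python
def ensure_scheme(url: str, default_scheme: str = "https") -> str:
    """Return url with a scheme, adding default_scheme only when needed.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    # Already absolute: leave it untouched.
    if url.startswith(("http://", "https://")):
        return url

    # Protocol-relative form such as "//host/image.jpg": add only "https:".
    if url.startswith("//"):
        return f"{default_scheme}:{url}"

    # Bare "host/image.jpg": add the scheme and the separator.
    return f"{default_scheme}://{url}"
```

Checking the prefix avoids a false match when `http://` happens to appear later in the string, for example inside a query parameter.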
@@ -3,6 +3,8 @@ import sys
|
|||||||
import random
|
import random
|
||||||
import socket
|
import socket
|
||||||
import webbrowser
|
import webbrowser
|
||||||
|
import urllib.request
|
||||||
|
from urllib.error import HTTPError
|
||||||
|
|
||||||
import praw
|
import praw
|
||||||
from prawcore.exceptions import NotFound, ResponseException, Forbidden
|
from prawcore.exceptions import NotFound, ResponseException, Forbidden
|
||||||
@@ -93,7 +95,7 @@ def beginPraw(config,user_agent = str(socket.gethostname())):
|
|||||||
authorizedInstance = GetAuth(reddit,port).getRefreshToken(*scopes)
|
authorizedInstance = GetAuth(reddit,port).getRefreshToken(*scopes)
|
||||||
reddit = authorizedInstance[0]
|
reddit = authorizedInstance[0]
|
||||||
refresh_token = authorizedInstance[1]
|
refresh_token = authorizedInstance[1]
|
||||||
jsonFile(GLOBAL.configDirectory / "config.json").add({
|
jsonFile(GLOBAL.configDirectory).add({
|
||||||
"reddit_username":str(reddit.user.me()),
|
"reddit_username":str(reddit.user.me()),
|
||||||
"reddit_refresh_token":refresh_token
|
"reddit_refresh_token":refresh_token
|
||||||
})
|
})
|
||||||
@@ -103,7 +105,7 @@ def beginPraw(config,user_agent = str(socket.gethostname())):
|
|||||||
authorizedInstance = GetAuth(reddit,port).getRefreshToken(*scopes)
|
authorizedInstance = GetAuth(reddit,port).getRefreshToken(*scopes)
|
||||||
reddit = authorizedInstance[0]
|
reddit = authorizedInstance[0]
|
||||||
refresh_token = authorizedInstance[1]
|
refresh_token = authorizedInstance[1]
|
||||||
jsonFile(GLOBAL.configDirectory / "config.json").add({
|
jsonFile(GLOBAL.configDirectory).add({
|
||||||
"reddit_username":str(reddit.user.me()),
|
"reddit_username":str(reddit.user.me()),
|
||||||
"reddit_refresh_token":refresh_token
|
"reddit_refresh_token":refresh_token
|
||||||
})
|
})
|
||||||
@@ -422,18 +424,20 @@ def checkIfMatching(submission):
|
|||||||
eromeCount += 1
|
eromeCount += 1
|
||||||
return details
|
return details
|
||||||
|
|
||||||
elif isDirectLink(submission.url) is not False:
|
|
||||||
details['postType'] = 'direct'
|
|
||||||
details['postURL'] = isDirectLink(submission.url)
|
|
||||||
directCount += 1
|
|
||||||
return details
|
|
||||||
|
|
||||||
elif submission.is_self:
|
elif submission.is_self:
|
||||||
details['postType'] = 'self'
|
details['postType'] = 'self'
|
||||||
details['postContent'] = submission.selftext
|
details['postContent'] = submission.selftext
|
||||||
selfCount += 1
|
selfCount += 1
|
||||||
return details
|
return details
|
||||||
|
|
||||||
|
directLink = isDirectLink(submission.url)
|
||||||
|
|
||||||
|
if directLink is not False:
|
||||||
|
details['postType'] = 'direct'
|
||||||
|
details['postURL'] = directLink
|
||||||
|
directCount += 1
|
||||||
|
return details
|
||||||
|
|
||||||
def printSubmission(SUB,validNumber,totalNumber):
|
def printSubmission(SUB,validNumber,totalNumber):
|
||||||
"""Print post's link, title and media link to screen"""
|
"""Print post's link, title and media link to screen"""
|
||||||
|
|
||||||
@@ -473,7 +477,22 @@ def isDirectLink(URL):
|
|||||||
return URL
|
return URL
|
||||||
|
|
||||||
elif "v.redd.it" in URL:
|
elif "v.redd.it" in URL:
|
||||||
return URL+"/DASH_600_K"
|
bitrates = ["DASH_1080","DASH_720","DASH_600", \
|
||||||
|
"DASH_480","DASH_360","DASH_240"]
|
||||||
|
|
||||||
|
for bitrate in bitrates:
|
||||||
|
videoURL = URL+"/"+bitrate
|
||||||
|
|
||||||
|
try:
|
||||||
|
responseCode = urllib.request.urlopen(videoURL).getcode()
|
||||||
|
except urllib.error.HTTPError:
|
||||||
|
responseCode = 0
|
||||||
|
|
||||||
|
if responseCode == 200:
|
||||||
|
return videoURL
|
||||||
|
|
||||||
|
else:
|
||||||
|
return False
|
||||||
|
|
||||||
for extension in imageTypes:
|
for extension in imageTypes:
|
||||||
if extension in URL:
|
if extension in URL:
|
||||||
|
|||||||
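Review note: the `isDirectLink` hunk above stops hard-coding `DASH_600_K` and instead tries a list of rendition names from highest to lowest until one answers with HTTP 200. The same idea is sketched below as a standalone function. The names `first_available_rendition` and `url_is_reachable`, the timeout value, and the injectable `probe` parameter are assumptions added for illustration and testing; only the rendition list and the use of `urllib.request.urlopen` come from the diff.

```python
import urllib.request
from typing import Callable, Iterable, Optional

# Rendition names in the order tried by the diff (highest first).
RENDITIONS = ("DASH_1080", "DASH_720", "DASH_600",
              "DASH_480", "DASH_360", "DASH_240")


def url_is_reachable(url: str, timeout: float = 10.0) -> bool:
    """Return True if url answers with HTTP 200 (timeout is an assumption)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.getcode() == 200
    except OSError:
        # HTTPError, URLError and socket timeouts all derive from OSError.
        return False


def first_available_rendition(
    base_url: str,
    renditions: Iterable[str] = RENDITIONS,
    probe: Callable[[str], bool] = url_is_reachable,
) -> Optional[str]:
    """Return the first base_url/<rendition> that probe accepts, else None."""
    for name in renditions:
        candidate = base_url.rstrip("/") + "/" + name
        if probe(candidate):
            return candidate
    return None
```

Because every candidate costs a network round trip, it makes sense to call this once per submission and reuse the result, which is what the `directLink = isDirectLink(submission.url)` change in `checkIfMatching` does.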
@@ -14,7 +14,8 @@ class GLOBAL:
|
|||||||
config = None
|
config = None
|
||||||
arguments = None
|
arguments = None
|
||||||
directory = None
|
directory = None
|
||||||
configDirectory = Path.home() / "Bulk Downloader for Reddit"
|
defaultConfigDirectory = Path.home() / "Bulk Downloader for Reddit"
|
||||||
|
configDirectory = ""
|
||||||
reddit_client_id = "BSyphDdxYZAgVQ"
|
reddit_client_id = "BSyphDdxYZAgVQ"
|
||||||
reddit_client_secret = "bfqNJaRh8NMh-9eAr-t4TRz-Blk"
|
reddit_client_secret = "bfqNJaRh8NMh-9eAr-t4TRz-Blk"
|
||||||
printVanilla = print
|
printVanilla = print
|
||||||
|
|||||||