23 Commits

Author | SHA1 | Message | Date
Ali Parlakçı | d6194c57d9 | Merge pull request #67 from dbanon87/dbanon87/erome-downloads: fix erome download URLs | 2019-10-17 09:51:02 +03:00
Ali Parlakçı | cd87a4a120 | Merge pull request #68 from dbanon87/dbanon87/gitignore-env: add env/ to gitignore | 2019-10-17 09:49:36 +03:00
dbanon87 | 08cddf4c83 | add env/ to gitignore. This allows working in a virtualenv in the project directory. | 2019-10-08 10:52:56 -04:00
dbanon87 | 88fa9e742d | fix erome download URLs | 2019-10-08 10:51:56 -04:00
Ali Parlakçı | 1c17f174a8 | typo | 2019-04-23 17:00:42 +03:00
Ali Parlakçı | 9b36336ac3 | typo | 2019-04-23 16:51:05 +03:00
Ali Parlakçı | 35e551f20c | Update README.md | 2019-04-23 14:04:15 +03:00
Ali Parlakçı | 0f2bda9c34 | Merge pull request #63 from aliparlakci/moreUsefulReadme: A more useful readme (credits to *stared*) | 2019-04-23 14:00:53 +03:00
Ali Parlakçı | 8ab694bcc1 | Fixed typo | 2019-04-23 13:59:01 +03:00
Ali | 898f59d035 | Added an FAQ entry | 2019-04-23 13:51:21 +03:00
Ali | 6b6db37185 | Minor corrections | 2019-04-23 13:29:58 +03:00
Piotr Migdał | d4a5100128 | a clearer description how to run it (#62) | 2019-04-23 13:17:15 +03:00
Ali | 22047338e2 | Update version number | 2019-04-09 20:45:22 +03:00
Ali | b16cdd3cbb | Hopefully, fixed the config.json bug | 2019-04-09 20:31:42 +03:00
Ali | 2a8394a48c | Fixed the bug concerning config.json | 2019-04-08 22:09:52 +03:00
Ali Parlakçı | eac4404bbf | Update README.md | 2019-03-31 11:59:49 +03:00
Ali Parlakci | fae49d50da | Update version | 2019-03-31 11:46:03 +03:00
Ali Parlakci | 7130525ece | Update version | 2019-03-31 11:35:27 +03:00
Ali Parlakci | 2bf1e03ee1 | Update version | 2019-03-31 11:33:29 +03:00
Ali | 15a91e5784 | Fixed saving auth info problem | 2019-02-24 12:28:40 +03:00
Ali | 344201a70d | Fixed v.redd.it links | 2019-02-23 00:01:39 +03:00
Ali | 92e47adb43 | Update version | 2019-02-22 23:59:57 +03:00
Ali | 4d385fda60 | Fixed v.redd.it links | 2019-02-22 23:59:03 +03:00

9 changed files with 168 additions and 40 deletions

3
.gitignore

@@ -3,4 +3,5 @@ dist/
 MANIFEST
 __pycache__/
 src/__pycache__/
 config.json
+env/

125
README.md

@@ -1,9 +1,11 @@
 # Bulk Downloader for Reddit
-Downloads media from reddit posts.
+Downloads media from reddit posts. Made by [u/aliparlakci](https://reddit.com/u/aliparlakci)
 ## [Download the latest release here](https://github.com/aliparlakci/bulk-downloader-for-reddit/releases/latest)
 ## What it can do
 - Can get posts from: frontpage, subreddits, multireddits, redditor's submissions, upvoted and saved posts; search results or just plain reddit links
 - Sorts posts by hot, top, new and so on
 - Downloads **REDDIT** images and videos, **IMGUR** images and albums, **GFYCAT** links, **EROME** images and albums, **SELF POSTS** and any link to a **DIRECT IMAGE**
@@ -13,33 +15,124 @@ Downloads media from reddit posts.
 - Saves a reusable copy of posts' details that are found so that they can be re-downloaded again
 - Logs failed ones in a file to so that you can try to download them later
-## **Compiling it from source code**
+## Installation
-MacOS users have to use this option. See *[here](docs/COMPILE_FROM_SOURCE.md)*
-## Additional options
+You can use it either as a `bulk-downloader-for-reddit.exe` executable file for Windows, as a Linux binary or as a *[Python script](#python-script)*. There is no MacOS executable, MacOS users must use the Python script option.
-Script also accepts additional options via command-line arguments. Get further information from **[`--help`](docs/COMMAND_LINE_ARGUMENTS.md)**
+### Executables
+For Windows and Linux, [download the latest executables, here](https://github.com/aliparlakci/bulk-downloader-for-reddit/releases/latest).
+### Python script
+* Download this repository ([latest zip](https://github.com/aliparlakci/bulk-downloader-for-reddit/archive/master.zip) or `git clone git@github.com:aliparlakci/bulk-downloader-for-reddit.git`).
+* Enter its folder.
+* Run `python ./script.py` from the command-line (Windows, MacOSX or Linux command line; it may work with Anaconda prompt) See [here](docs/INTERPRET_FROM_SOURCE.md#finding-the-correct-keyword-for-python) if you have any trouble with this step.
+It uses Python 3.6 and above. It won't work with Python 3.5 or any Python 2.x. If you have a trouble setting it up, see [here](docs/INTERPRET_FROM_SOURCE.md).
+### Setting up the script
-## Setting up the script
 You need to create an imgur developer app in order API to work. Go to https://api.imgur.com/oauth2/addclient and fill the form (It does not really matter how you fill it).
 It should redirect you to a page where it shows your **imgur_client_id** and **imgur_client_secret**.
+When you run it for the first time, it will automatically create `config.json` file containing `imgur_client_id`, `imgur_client_secret`, `reddit_username` and `reddit_refresh_token`.
+## Running
+You can run it it an interactive mode, or using [command-line arguments](docs/COMMAND_LINE_ARGUMENTS.md) (also available via `python ./script.py --help` or `bulk-downloader-for-reddit.exe --help`).
+To run the interactive mode, simply use `python ./script.py` or double click on `bulk-downloader-for-reddit.exe` without any extra commands.
+### [Example for command line arguments](docs/COMMAND_LINE_ARGUMENTS.md#examples)
+### Example for an interactive script
+```
+(py37) bulk-downloader-for-reddit user$ python ./script.py
+Bulk Downloader for Reddit v1.6.5
+Written by Ali PARLAKCI parlakciali@gmail.com
+https://github.com/aliparlakci/bulk-downloader-for-reddit/
+download directory: downloads/dataisbeautiful_last_few
+select program mode:
+[1] search
+[2] subreddit
+[3] multireddit
+[4] submitted
+[5] upvoted
+[6] saved
+[7] log
+[0] exit
+> 2
+(type frontpage for all subscribed subreddits,
+use plus to seperate multi subreddits: pics+funny+me_irl etc.)
+subreddit: dataisbeautiful
+select sort type:
+[1] hot
+[2] top
+[3] new
+[4] rising
+[5] controversial
+[0] exit
+> 1
+limit (0 for none): 50
+GETTING POSTS
+(1/24) r/dataisbeautiful
+AutoModerator_[Battle]_DataViz_Battle_for_the_month_of_April_2019__Visualize_the_April_Fool's_Prank_for_2019-04-01_on__r_DataIsBeautiful_b8ws37.md
+Downloaded
+(2/24) r/dataisbeautiful
+AutoModerator_[Topic][Open]_Open_Discussion_Monday_—_Anybody_can_post_a_general_visualization_question_or_start_a_fresh_discussion!_bg1wej.md
+Downloaded
+...
+Total of 24 links downloaded!
+Press enter to quit
+```
 ## FAQ
+### I am running the script on a headless machine or on a remote server. How can I authenticate my reddit account?
+- Download the script on your everday computer and run it for once.
+- Authenticate the program on both reddit and imgur.
+- Go to your Home folder (for Windows users it is `C:\Users\[USERNAME]\`, for Linux users it is `/home/[USERNAME]`)
+- Copy the *config.json* file inside the Bulk Downloader for Reddit folder and paste it **next to** the file that you run the program.
 ### How can I change my credentials?
-- All of the user data is held in **config.json** file which is in a folder named "Bulk Downloader for Reddit" in your **Home** directory. You can edit
-them, there.
+- All of the user data is held in **config.json** file which is in a folder named "Bulk Downloader for Reddit" in your **Home** directory. You can edit them, there.
+Also if you already have a config.json file, you can paste it **next to** the script and override the one on your Home directory.
 ### What do the dots resemble when getting posts?
 - Each dot means that 100 posts are scanned.
 ### Getting posts takes too long.
 - You can press *Ctrl+C* to interrupt it and start downloading.
 ### How are the filenames formatted?
 - **Self posts** and **images** that do not belong to an album and **album folders** are formatted as:
 `[SUBMITTER NAME]_[POST TITLE]_[REDDIT ID]`
 You can use *reddit id* to go to post's reddit page by going to link reddit.com/[REDDIT ID]
 - An **image in an album** is formatted as:
 `[ITEM NUMBER]_[IMAGE TITLE]_[IMGUR ID]`
 Similarly, you can use *imgur id* to go to image's imgur page by going to link imgur.com/[IMGUR ID].
@@ -50,4 +143,6 @@ It should redirect you to a page where it shows your **imgur_client_id** and **i
 However, they are basically text files. You can also view them with any text editor such as Notepad on Windows, gedit on Linux or Text Editor on MacOS
-## [See the changes on *master* here](docs/CHANGELOG.md)
+## Changelog
+* [See the changes on *master* here](docs/CHANGELOG.md)

@@ -1,4 +1,7 @@
 # Changes on *master*
+## [23/02/2019](https://github.com/aliparlakci/bulk-downloader-for-reddit/tree/4d385fda60028343be816eb7c4f7bc613a9d555d)
+- Fixed v.redd.it links
 ## [27/01/2019](https://github.com/aliparlakci/bulk-downloader-for-reddit/tree/b7baf07fb5998368d87e3c4c36aed40daf820609)
 - Clarified the instructions
@@ -80,4 +83,4 @@
 ## [10/07/2018](https://github.com/aliparlakci/bulk-downloader-for-reddit/tree/ffe3839aee6dc1a552d95154d817aefc2b66af81)
 - Added support for *self* post
 - Now getting posts is quicker

@@ -1,6 +1,6 @@
 # Using command-line arguments
-See **[compiling from source](COMPILE_FROM_SOURCE.md)** page first unless you are using an executable file. If you are using an executable file, see [using terminal](COMPILE_FROM_SOURCE.md#using-terminal) and come back.
+See **[compiling from source](INTERPRET_FROM_SOURCE.md)** page first unless you are using an executable file. If you are using an executable file, see [using terminal](INTERPRET_FROM_SOURCE.md#using-terminal) and come back.
 ***Use*** `.\bulk-downloader-for-reddit.exe` ***or*** `./bulk-downloader-for-reddit` ***if you are using the executable***.
```console ```console
@@ -98,4 +98,4 @@ python script.py --directory C:\\NEW_FOLDER\\ANOTHER_FOLDER --log UNNAMED_FOLDER
 # FAQ
 ## I can't startup the script no matter what.
-See **[finding the correct keyword for Python](COMPILE_FROM_SOURCE.md#finding-the-correct-keyword-for-python)**
+See **[finding the correct keyword for Python](INTERPRET_FROM_SOURCE.md#finding-the-correct-keyword-for-python)**

@@ -1,4 +1,4 @@
-# Compiling from source code
+# Interpret from source code
 ## Requirements
 ### Python 3 Interpreter
 - This program is designed to work best on **Python 3.6.5** and this version of Python 3 is suggested. See if it is already installed, [here](#finding-the-correct-keyword-for-python).
@@ -6,11 +6,11 @@
 ## Using terminal
 ### To open it...
-- **On Windows**: Press **Shift+Right Click**, select **Open Powershell window here** or **Open Command Prompt window here**
+- **on Windows**: Press **Shift+Right Click**, select **Open Powershell window here** or **Open Command Prompt window here**
-- **On Linux**: Right-click in a folder and select **Open Terminal** or press **Ctrl+Alt+T**.
+- **on Linux**: Right-click in a folder and select **Open Terminal** or press **Ctrl+Alt+T**.
-- **On MacOS**: Look for an app called **Terminal**.
+- **on MacOS**: Look for an app called **Terminal**.
 ### Navigating to the directory where script is downloaded
 Go inside the folder where script.py is located. If you are not familiar with changing directories on command-prompt and terminal read *Changing Directories* in [this article](https://lifehacker.com/5633909/who-needs-a-mouse-learn-to-use-the-command-line-for-almost-anything)

@@ -23,7 +23,7 @@ from src.tools import (GLOBAL, createLogFile, jsonFile, nameCorrector,
 __author__ = "Ali Parlakci"
 __license__ = "GPL"
-__version__ = "1.6.4.1"
+__version__ = "1.6.5"
 __maintainer__ = "Ali Parlakci"
 __email__ = "parlakciali@gmail.com"
@@ -672,10 +672,15 @@ def main():
     except ProgramModeError as err:
         PromptUser()
-    if not Path(GLOBAL.configDirectory).is_dir():
-        os.makedirs(GLOBAL.configDirectory)
-    GLOBAL.config = getConfig("config.json") if Path("config.json").exists() \
-                    else getConfig(GLOBAL.configDirectory / "config.json")
+    if not Path(GLOBAL.defaultConfigDirectory).is_dir():
+        os.makedirs(GLOBAL.defaultConfigDirectory)
+    if Path("config.json").exists():
+        GLOBAL.configDirectory = Path("config.json")
+    else:
+        GLOBAL.configDirectory = GLOBAL.defaultConfigDirectory / "config.json"
+    GLOBAL.config = getConfig(GLOBAL.configDirectory)
     if GLOBAL.arguments.log is not None:
         logDir = Path(GLOBAL.arguments.log)
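
The lookup order introduced in this hunk (a `config.json` in the working directory wins, otherwise the one under the default directory in the user's home is used) can be sketched in isolation. This is a minimal illustration, not the project's code: `resolve_config_path` and `DEFAULT_CONFIG_DIR` are hypothetical names, and only the folder name "Bulk Downloader for Reddit" is taken from the diff.

```python
from pathlib import Path

# Hypothetical constant mirroring GLOBAL.defaultConfigDirectory from the diff.
DEFAULT_CONFIG_DIR = Path.home() / "Bulk Downloader for Reddit"


def resolve_config_path(working_dir=Path("."), default_dir=DEFAULT_CONFIG_DIR):
    """Return the config.json to use: a local file takes precedence over the default."""
    local = Path(working_dir) / "config.json"
    if local.exists():
        return local
    return Path(default_dir) / "config.json"
```

Returning a single resolved path means later writes (such as saving the refresh token) go to the same file that was read, which appears to be the intent of passing `GLOBAL.configDirectory` directly to `jsonFile` in the later hunks.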

@@ -117,7 +117,9 @@ class Erome:
                 post["postSubmitter"]+"_"+title+"_"+post['postId']+".tmp"
             )
-            imageURL = "https:" + IMAGES[0]
+            imageURL = IMAGES[0]
+            if 'https://' not in imageURL and 'http://' not in imageURL:
+                imageURL = "https://" + imageURL
             try:
                 getFile(fileDir,tempDir,imageURL)
@@ -146,7 +148,9 @@ class Erome:
                 extension = getExtension(IMAGES[i])
                 fileName = str(i+1)
-                imageURL = "https:" + IMAGES[i]
+                imageURL = IMAGES[i]
+                if 'https://' not in imageURL and 'http://' not in imageURL:
+                    imageURL = "https://" + imageURL
                 fileDir = folderDir / (fileName + extension)
                 tempDir = folderDir / (fileName + ".tmp")
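
The same scheme check appears twice in this file, so it could be factored into one helper. The sketch below is a possible variation, not what the PR does: `ensure_scheme` is a hypothetical name, it tests the start of the string rather than using a substring `in` check, and it also handles the protocol-relative `//host/path` form that the removed `"https:" + ...` line suggests the scraper used to return.

```python
def ensure_scheme(url, default_scheme="https"):
    """Return url with a scheme, adding default_scheme only when none is present."""
    if url.startswith(("http://", "https://")):
        return url
    if url.startswith("//"):
        # Protocol-relative form, e.g. "//host/path"
        return default_scheme + ":" + url
    return default_scheme + "://" + url
```

With a helper like this, both call sites would reduce to something like `imageURL = ensure_scheme(IMAGES[i])`.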

@@ -3,6 +3,8 @@ import sys
 import random
 import socket
 import webbrowser
+import urllib.request
+from urllib.error import HTTPError
 import praw
 from prawcore.exceptions import NotFound, ResponseException, Forbidden
@@ -93,7 +95,7 @@ def beginPraw(config,user_agent = str(socket.gethostname())):
         authorizedInstance = GetAuth(reddit,port).getRefreshToken(*scopes)
         reddit = authorizedInstance[0]
         refresh_token = authorizedInstance[1]
-        jsonFile(GLOBAL.configDirectory / "config.json").add({
+        jsonFile(GLOBAL.configDirectory).add({
             "reddit_username":str(reddit.user.me()),
             "reddit_refresh_token":refresh_token
         })
@@ -103,7 +105,7 @@ def beginPraw(config,user_agent = str(socket.gethostname())):
         authorizedInstance = GetAuth(reddit,port).getRefreshToken(*scopes)
         reddit = authorizedInstance[0]
         refresh_token = authorizedInstance[1]
-        jsonFile(GLOBAL.configDirectory / "config.json").add({
+        jsonFile(GLOBAL.configDirectory).add({
             "reddit_username":str(reddit.user.me()),
             "reddit_refresh_token":refresh_token
         })
@@ -422,18 +424,20 @@ def checkIfMatching(submission):
         eromeCount += 1
         return details
-    elif isDirectLink(submission.url) is not False:
-        details['postType'] = 'direct'
-        details['postURL'] = isDirectLink(submission.url)
-        directCount += 1
-        return details
     elif submission.is_self:
         details['postType'] = 'self'
         details['postContent'] = submission.selftext
         selfCount += 1
         return details
+    directLink = isDirectLink(submission.url)
+    if directLink is not False:
+        details['postType'] = 'direct'
+        details['postURL'] = directLink
+        directCount += 1
+        return details
 def printSubmission(SUB,validNumber,totalNumber):
     """Print post's link, title and media link to screen"""
@@ -473,7 +477,22 @@ def isDirectLink(URL):
         return URL
     elif "v.redd.it" in URL:
-        return URL+"/DASH_600_K"
+        bitrates = ["DASH_1080","DASH_720","DASH_600", \
+                    "DASH_480","DASH_360","DASH_240"]
+        for bitrate in bitrates:
+            videoURL = URL+"/"+bitrate
+            try:
+                responseCode = urllib.request.urlopen(videoURL).getcode()
+            except urllib.error.HTTPError:
+                responseCode = 0
+            if responseCode == 200:
+                return videoURL
+        else:
+            return False
     for extension in imageTypes:
         if extension in URL:
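
The v.redd.it change replaces a fixed `DASH_600_K` suffix with a probe that walks a quality list from highest to lowest and returns the first URL that answers with HTTP 200. A standalone sketch of that idea follows. The function names and the injectable `status_of` parameter are assumptions added so the logic can be exercised without network access; unlike the diff, this version also treats any `URLError` (not only `HTTPError`) as "not available".

```python
import urllib.error
import urllib.request

# Quality suffixes in the order tried by the diff, highest first.
QUALITIES = ["DASH_1080", "DASH_720", "DASH_600", "DASH_480", "DASH_360", "DASH_240"]


def http_status(url):
    """Return the HTTP status code for url, or 0 if the request fails."""
    try:
        with urllib.request.urlopen(url) as response:
            return response.getcode()
    except urllib.error.URLError:  # HTTPError is a subclass of URLError
        return 0


def first_available(base_url, qualities=QUALITIES, status_of=http_status):
    """Return the first base_url/quality that answers 200, or False if none does."""
    for quality in qualities:
        candidate = base_url + "/" + quality
        if status_of(candidate) == 200:
            return candidate
    return False
```

Because each probe is a network request, the caller should invoke it once and reuse the result, which is what the `checkIfMatching` hunk above now does by storing the return value in `directLink` instead of calling `isDirectLink` twice.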

@@ -14,7 +14,8 @@ class GLOBAL:
     config = None
     arguments = None
     directory = None
-    configDirectory = Path.home() / "Bulk Downloader for Reddit"
+    defaultConfigDirectory = Path.home() / "Bulk Downloader for Reddit"
+    configDirectory = ""
     reddit_client_id = "BSyphDdxYZAgVQ"
     reddit_client_secret = "bfqNJaRh8NMh-9eAr-t4TRz-Blk"
     printVanilla = print