Merge pull request #67 from dbanon87/dbanon87/erome-downloads

fix erome download URLs
Merge pull request #68 from dbanon87/dbanon87/gitignore-env
2026-01-08 10:15:36 +00:00 · 2019-10-17 09:51:02 +03:00 · 2019-10-17 09:49:36 +03:00 · 2019-10-08 10:52:56 -04:00 · 2019-10-08 10:51:56 -04:00 · 2019-04-23 17:00:42 +03:00
10 changed files with 194 additions and 64 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -3,4 +3,5 @@ dist/
 MANIFEST
 __pycache__/
 src/__pycache__/
-config.json
+config.json
+env/
--- a/README.md
+++ b/README.md
@@ -1,9 +1,11 @@
 # Bulk Downloader for Reddit
-Downloads media from reddit posts.

-## [Download the latest release](https://github.com/aliparlakci/bulk-downloader-for-reddit/releases/latest)
+Downloads media from reddit posts. Made by [u/aliparlakci](https://reddit.com/u/aliparlakci)
+
+## [Download the latest release here](https://github.com/aliparlakci/bulk-downloader-for-reddit/releases/latest)

 ## What it can do
+
 - Can get posts from: frontpage, subreddits, multireddits, redditor's submissions, upvoted and saved posts; search results or just plain reddit links
 - Sorts posts by hot, top, new and so on
 - Downloads **REDDIT** images and videos, **IMGUR** images and albums, **GFYCAT** links, **EROME** images and albums, **SELF POSTS** and any link to a **DIRECT IMAGE**
@@ -13,17 +15,134 @@ Downloads media from reddit posts.
 - Saves a reusable copy of posts' details that are found so that they can be re-downloaded again
 - Logs failed ones to a file so that you can try to download them later

-## **[Compiling it from source code](docs/COMPILE_FROM_SOURCE.md)**
-*\* MacOS users have to use this option.*
+## Installation

-## Additional options
-Script also accepts additional options via command-line arguments. Get further information from **[`--help`](docs/COMMAND_LINE_ARGUMENTS.md)**
+You can use it either as a `bulk-downloader-for-reddit.exe` executable file for Windows, as a Linux binary or as a *[Python script](#python-script)*. There is no MacOS executable; MacOS users must use the Python script option.
+
+### Executables
+
+For Windows and Linux, [download the latest executables, here](https://github.com/aliparlakci/bulk-downloader-for-reddit/releases/latest).
+
+### Python script
+
+* Download this repository ([latest zip](https://github.com/aliparlakci/bulk-downloader-for-reddit/archive/master.zip) or `git clone git@github.com:aliparlakci/bulk-downloader-for-reddit.git`).
+* Enter its folder.
+* Run `python ./script.py` from the command line (Windows, MacOSX or Linux command line; it may also work from the Anaconda prompt). See [here](docs/INTERPRET_FROM_SOURCE.md#finding-the-correct-keyword-for-python) if you have any trouble with this step.
+
+It requires Python 3.6 or above; it won't work with Python 3.5 or any Python 2.x. If you have trouble setting it up, see [here](docs/INTERPRET_FROM_SOURCE.md).
+
+
+### Setting up the script

-## Setting up the script
 You need to create an imgur developer app in order for the API to work. Go to https://api.imgur.com/oauth2/addclient and fill in the form (it does not really matter how you fill it in).
-  
-It should redirect you to a page where it shows your **imgur_client_id** and **imgur_client_secret**.
-  
-## [FAQ](docs/FAQ.md)

-## [Changes on *master*](docs/CHANGELOG.md)
+It should redirect you to a page where it shows your **imgur_client_id** and **imgur_client_secret**.
+
+When you run it for the first time, it will automatically create a `config.json` file containing `imgur_client_id`, `imgur_client_secret`, `reddit_username` and `reddit_refresh_token`.
+
+
+## Running
+
+You can run it in interactive mode, or using [command-line arguments](docs/COMMAND_LINE_ARGUMENTS.md) (also available via `python ./script.py --help` or `bulk-downloader-for-reddit.exe --help`).
+
+To run the interactive mode, simply use `python ./script.py` or double click on `bulk-downloader-for-reddit.exe` without any extra commands.
+
+### [Example for command line arguments](docs/COMMAND_LINE_ARGUMENTS.md#examples)
+
+### Example for an interactive script
+
+```
+(py37) bulk-downloader-for-reddit user$ python ./script.py
+
+Bulk Downloader for Reddit v1.6.5
+Written by Ali PARLAKCI – parlakciali@gmail.com
+
+https://github.com/aliparlakci/bulk-downloader-for-reddit/
+
+download directory: downloads/dataisbeautiful_last_few
+select program mode:
+
+    [1] search
+    [2] subreddit
+    [3] multireddit
+    [4] submitted
+    [5] upvoted
+    [6] saved
+    [7] log
+    [0] exit
+
+> 2
+(type frontpage for all subscribed subreddits,
+ use plus to seperate multi subreddits: pics+funny+me_irl etc.)
+
+subreddit: dataisbeautiful
+
+select sort type:
+
+    [1] hot
+    [2] top
+    [3] new
+    [4] rising
+    [5] controversial
+    [0] exit
+
+> 1
+
+limit (0 for none): 50
+
+GETTING POSTS
+
+
+(1/24) – r/dataisbeautiful
+AutoModerator_[Battle]_DataViz_Battle_for_the_month_of_April_2019__Visualize_the_April_Fool's_Prank_for_2019-04-01_on__r_DataIsBeautiful_b8ws37.md
+Downloaded
+
+(2/24) – r/dataisbeautiful
+AutoModerator_[Topic][Open]_Open_Discussion_Monday_—_Anybody_can_post_a_general_visualization_question_or_start_a_fresh_discussion!_bg1wej.md
+Downloaded
+
+...
+
+Total of 24 links downloaded!
+
+Press enter to quit
+```
+
+
+## FAQ
+
+### I am running the script on a headless machine or on a remote server. How can I authenticate my reddit account?
+- Download the script on your everyday computer and run it once.
+- Authenticate the program on both reddit and imgur.
+- Go to your Home folder (for Windows users it is `C:\Users\[USERNAME]\`, for Linux users it is `/home/[USERNAME]`).
+- Copy the *config.json* file from the Bulk Downloader for Reddit folder and paste it **next to** the file that you use to run the program.
+
+### How can I change my credentials?
+- All of the user data is held in the **config.json** file, which is in a folder named "Bulk Downloader for Reddit" in your **Home** directory. You can edit it there.  
+
+  Also, if you already have a config.json file, you can paste it **next to** the script to override the one in your Home directory.
+
+### What do the dots represent when getting posts?
+- Each dot means that 100 posts are scanned.
+
+### Getting posts takes too long.
+- You can press *Ctrl+C* to interrupt it and start downloading.
+
+### How are the filenames formatted?
+- **Self posts** and **images** that do not belong to an album and **album folders** are formatted as:  
+  `[SUBMITTER NAME]_[POST TITLE]_[REDDIT ID]`  
+  You can use the *reddit id* to open the post's reddit page at reddit.com/[REDDIT ID].
+
+- An **image in an album** is formatted as:  
+  `[ITEM NUMBER]_[IMAGE TITLE]_[IMGUR ID]`  
+  Similarly, you can use the *imgur id* to open the image's imgur page at imgur.com/[IMGUR ID].
+
+### How do I open self post files?
+- Self posts are stored on reddit as Markdown, so the script downloads them as they are in order to preserve their styling.
+  There is a [great Chrome extension](https://chrome.google.com/webstore/detail/markdown-viewer/ckkdlimhmcjmikdlpkmbgfkaikojcbjk) for viewing Markdown files with their styling. Install it and open the files with [Chrome](https://www.google.com/intl/tr/chrome/).  
+
+  They are also plain text files, so you can view them with any text editor such as Notepad on Windows, gedit on Linux or TextEdit on MacOS.
+
+## Changelog
+
+* [See the changes on *master* here](docs/CHANGELOG.md)
--- a/docs/CHANGELOG.md
+++ b/docs/CHANGELOG.md
@@ -1,4 +1,7 @@
 # Changes on *master*
+## [23/02/2019](https://github.com/aliparlakci/bulk-downloader-for-reddit/tree/4d385fda60028343be816eb7c4f7bc613a9d555d)
+- Fixed v.redd.it links
+
 ## [27/01/2019](https://github.com/aliparlakci/bulk-downloader-for-reddit/tree/b7baf07fb5998368d87e3c4c36aed40daf820609)
 - Clarified the instructions

@@ -80,4 +83,4 @@

 ## [10/07/2018](https://github.com/aliparlakci/bulk-downloader-for-reddit/tree/ffe3839aee6dc1a552d95154d817aefc2b66af81)
 - Added support for *self* post
- Now getting posts is quicker
+- Now getting posts is quicker
--- a/docs/COMMAND_LINE_ARGUMENTS.md
+++ b/docs/COMMAND_LINE_ARGUMENTS.md
@@ -1,6 +1,6 @@
 # Using command-line arguments

-See **[compiling from source](COMPILE_FROM_SOURCE.md)** page first unless you are using an executable file. If you are using an executable file, see [using terminal](COMPILE_FROM_SOURCE.md#using-terminal) and come back.
+See the **[interpreting from source](INTERPRET_FROM_SOURCE.md)** page first unless you are using an executable file. If you are using an executable file, see [using terminal](INTERPRET_FROM_SOURCE.md#using-terminal) and come back.

 ***Use*** `.\bulk-downloader-for-reddit.exe` ***or*** `./bulk-downloader-for-reddit` ***if you are using the executable***.
 ```console
@@ -98,4 +98,4 @@ python script.py --directory C:\\NEW_FOLDER\\ANOTHER_FOLDER --log UNNAMED_FOLDER

 # FAQ
 ## I can't startup the script no matter what.
-See **[finding the correct keyword for Python](COMPILE_FROM_SOURCE.md#finding-the-correct-keyword-for-python)**
+See **[finding the correct keyword for Python](INTERPRET_FROM_SOURCE.md#finding-the-correct-keyword-for-python)**
--- a/docs/FAQ.md
+++ b/docs/FAQ.md
@@ -1,23 +0,0 @@
-# FAQ
-## What do the dots resemble when getting posts?
- Each dot means that 100 posts are scanned. 
-  
-## Getting posts is taking too long.
- You can press Ctrl+C to interrupt it and start downloading.
-  
-## How are filenames formatted?
- Self posts and images that are not belong to an album are formatted as **`[SUBMITTER NAME]_[POST TITLE]_[REDDIT ID]`**.
-  You can use *reddit id* to go to post's reddit page by going to link **reddit.com/[REDDIT ID]**
-  
- An image in an imgur album is formatted as **`[ITEM NUMBER]_[IMAGE TITLE]_[IMGUR ID]`**
-  Similarly, you can use *imgur id* to go to image's imgur page by going to link **imgur.com/[IMGUR ID]**.
-
-## How do I open self post files?
- Self posts are held at reddit as styled with markdown. So, the script downloads them as they are in order not to lose their stylings.
-  However, there is a [great Chrome extension](https://chrome.google.com/webstore/detail/markdown-viewer/ckkdlimhmcjmikdlpkmbgfkaikojcbjk) for viewing Markdown files with its styling. Install it and open the files with [Chrome](https://www.google.com/intl/tr/chrome/).  
-
-  However, they are basically text files. You can also view them with any text editor such as Notepad on Windows, gedit on Linux or Text Editor on MacOS
-
-## How can I change my credentials?
- All of the user data is held in **config.json** file which is in a folder named "Bulk Downloader for Reddit" in your **Home** directory. You can edit 
-  them, there.
--- a/docs/INTERPRET_FROM_SOURCE.md
+++ b/docs/INTERPRET_FROM_SOURCE.md
@@ -1,16 +1,16 @@
-# Compiling from source code
+# Interpret from source code
 ## Requirements
 ### Python 3 Interpreter
-Latest* version of **Python 3** is needed. See if it is already installed [here](#finding-the-correct-keyword-for-python). If not, download the matching release for your platform [here](https://www.python.org/downloads/) and install it. If you are a *Windows* user, selecting **Add Python 3 to PATH** option when installing the software is mandatory.   
-  
-\* *Use Python 3.6.5 if you encounter an issue*
+- This program is designed to work best on **Python 3.6.5**, so that version is recommended. See [here](#finding-the-correct-keyword-for-python) to check whether it is already installed.  
+- If not, download the matching release for your platform [here](https://www.python.org/downloads/) and install it. If you are a *Windows* user, selecting the **Add Python 3 to PATH** option during installation is mandatory.   
+
 ## Using terminal
 ### To open it...
-  **On Windows**: Press **Shift+Right Click**, select **Open Powershell window here** or **Open Command Prompt window here**
+-  **on Windows**: Press **Shift+Right Click**, select **Open Powershell window here** or **Open Command Prompt window here**
  
- **On Linux**: Right-click in a folder and select **Open Terminal** or press **Ctrl+Alt+T**.
+- **on Linux**: Right-click in a folder and select **Open Terminal** or press **Ctrl+Alt+T**.
  
- **On MacOS**: Look for an app called **Terminal**.
+- **on MacOS**: Look for an app called **Terminal**.
  
 ### Navigating to the directory where script is downloaded
 Go inside the folder where script.py is located. If you are not familiar with changing directories on command-prompt and terminal read *Changing Directories* in [this article](https://lifehacker.com/5633909/who-needs-a-mouse-learn-to-use-the-command-line-for-almost-anything)
--- a/script.py
+++ b/script.py
@@ -23,7 +23,7 @@ from src.tools import (GLOBAL, createLogFile, jsonFile, nameCorrector,

 __author__ = "Ali Parlakci"
 __license__ = "GPL"
-__version__ = "1.6.4.1"
+__version__ = "1.6.5"
 __maintainer__ = "Ali Parlakci"
 __email__ = "parlakciali@gmail.com"

@@ -279,7 +279,8 @@ class PromptUser:
                GLOBAL.arguments.subreddit = "+".join(GLOBAL.arguments.subreddit.split())

            # DELETE THE PLUS (+) AT THE END
-            if not subredditInput.lower() == "frontpage":
+            if not subredditInput.lower() == "frontpage" \
+                and GLOBAL.arguments.subreddit[-1] == "+":
                GLOBAL.arguments.subreddit = GLOBAL.arguments.subreddit[:-1]

            print("\nselect sort type:")
@@ -671,10 +672,15 @@ def main():
    except ProgramModeError as err:
        PromptUser()

-    if not Path(GLOBAL.configDirectory).is_dir():
-        os.makedirs(GLOBAL.configDirectory)
-    GLOBAL.config = getConfig("config.json") if Path("config.json").exists() \
-                    else getConfig(GLOBAL.configDirectory / "config.json")
+    if not Path(GLOBAL.defaultConfigDirectory).is_dir():
+        os.makedirs(GLOBAL.defaultConfigDirectory)
+
+    if Path("config.json").exists():
+        GLOBAL.configDirectory = Path("config.json")
+    else:
+        GLOBAL.configDirectory = GLOBAL.defaultConfigDirectory  / "config.json"
+
+    GLOBAL.config = getConfig(GLOBAL.configDirectory)

    if GLOBAL.arguments.log is not None:
        logDir = Path(GLOBAL.arguments.log)
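Review note on the `main()` hunk above: after this change `GLOBAL.configDirectory` holds the path of the `config.json` file itself rather than a directory, which is why the `/ "config.json"` suffix is dropped in `src/searcher.py`. A minimal standalone sketch of the same lookup order follows; `resolve_config_path` and its parameters are hypothetical names used for illustration and are not part of the project.

```python
from pathlib import Path


def resolve_config_path(cwd: Path, default_dir: Path) -> Path:
    """Return the config.json to use.

    Hypothetical helper mirroring the lookup order in the hunk above:
    a config.json next to the script wins, otherwise the file inside
    the default directory is used (the directory is created if missing).
    """
    local = cwd / "config.json"
    if local.exists():
        return local
    # Make sure the fallback directory exists before returning a path in it.
    default_dir.mkdir(parents=True, exist_ok=True)
    return default_dir / "config.json"


if __name__ == "__main__":
    default = Path.home() / "Bulk Downloader for Reddit"
    print(resolve_config_path(Path.cwd(), default))
```

Returning the full file path from one place keeps every later read and write pointed at the same file, which appears to be the intent of the change.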
--- a/src/downloader.py
+++ b/src/downloader.py
@@ -117,7 +117,9 @@ class Erome:
                post["postSubmitter"]+"_"+title+"_"+post['postId']+".tmp"
            )

-            imageURL = "https:" + IMAGES[0]
+            imageURL = IMAGES[0]
+            if 'https://' not in imageURL and 'http://' not in imageURL:
+                imageURL = "https://" + imageURL

            try:
                getFile(fileDir,tempDir,imageURL)
@@ -146,7 +148,9 @@ class Erome:
                extension = getExtension(IMAGES[i])

                fileName = str(i+1)
-                imageURL = "https:" + IMAGES[i]
+                imageURL = IMAGES[i]
+                if 'https://' not in imageURL and 'http://' not in imageURL:
+                    imageURL = "https://" + imageURL

                fileDir = folderDir / (fileName + extension)
                tempDir = folderDir / (fileName + ".tmp")
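Review note on the two `Erome` hunks above: the new check looks for `https://` or `http://` anywhere in the string and otherwise prepends `https://`. If an entry is still protocol-relative (`//host/path`, which the removed `"https:" + IMAGES[0]` line suggests was once the case), the result would be `https:////host/path`. The sketch below is one possible way to cover all three shapes; `ensure_scheme` is a hypothetical helper, not project code, and the example hosts are made up.

```python
def ensure_scheme(url: str, default_scheme: str = "https") -> str:
    """Return url with an explicit scheme (hypothetical helper).

    Handles absolute URLs, protocol-relative URLs ("//host/path")
    and bare "host/path" strings.
    """
    if url.startswith(("http://", "https://")):
        # Already absolute: leave untouched.
        return url
    if url.startswith("//"):
        # Protocol-relative: only the scheme and colon are missing.
        return f"{default_scheme}:{url}"
    # Bare host/path: add the scheme and the double slash.
    return f"{default_scheme}://{url}"


if __name__ == "__main__":
    for raw in ("//cdn.example.com/a.jpg",
                "cdn.example.com/a.jpg",
                "http://cdn.example.com/a.jpg"):
        print(raw, "->", ensure_scheme(raw))
```

Using `startswith` rather than `in` also avoids treating a URL that merely contains `http://` in its query string as already absolute.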
--- a/src/searcher.py
+++ b/src/searcher.py
@@ -3,6 +3,8 @@ import sys
 import random
 import socket
 import webbrowser
+import urllib.request
+from urllib.error import HTTPError

 import praw
 from prawcore.exceptions import NotFound, ResponseException, Forbidden
@@ -93,7 +95,7 @@ def beginPraw(config,user_agent = str(socket.gethostname())):
            authorizedInstance = GetAuth(reddit,port).getRefreshToken(*scopes)
            reddit = authorizedInstance[0]
            refresh_token = authorizedInstance[1]
-            jsonFile(GLOBAL.configDirectory / "config.json").add({
+            jsonFile(GLOBAL.configDirectory).add({
                "reddit_username":str(reddit.user.me()),
                "reddit_refresh_token":refresh_token
            })
@@ -103,7 +105,7 @@ def beginPraw(config,user_agent = str(socket.gethostname())):
        authorizedInstance = GetAuth(reddit,port).getRefreshToken(*scopes)
        reddit = authorizedInstance[0]
        refresh_token = authorizedInstance[1]
-        jsonFile(GLOBAL.configDirectory / "config.json").add({
+        jsonFile(GLOBAL.configDirectory).add({
            "reddit_username":str(reddit.user.me()),
            "reddit_refresh_token":refresh_token
        })
@@ -422,18 +424,20 @@ def checkIfMatching(submission):
        eromeCount += 1
        return details

-    elif isDirectLink(submission.url) is not False:
-        details['postType'] = 'direct'
-        details['postURL'] = isDirectLink(submission.url)
-        directCount += 1
-        return details
-
    elif submission.is_self:
        details['postType'] = 'self'
        details['postContent'] = submission.selftext
        selfCount += 1
        return details

+    directLink = isDirectLink(submission.url)
+
+    if directLink is not False:
+        details['postType'] = 'direct'
+        details['postURL'] = directLink
+        directCount += 1
+        return details
+
 def printSubmission(SUB,validNumber,totalNumber):
    """Print post's link, title and media link to screen"""

@@ -473,7 +477,22 @@ def isDirectLink(URL):
        return URL

    elif "v.redd.it" in URL:
-        return URL+"/DASH_600_K"
+        bitrates = ["DASH_1080","DASH_720","DASH_600", \
+                    "DASH_480","DASH_360","DASH_240"]
+                    
+        for bitrate in bitrates:
+            videoURL = URL+"/"+bitrate
+
+            try:
+                responseCode = urllib.request.urlopen(videoURL).getcode()
+            except urllib.error.HTTPError:
+                responseCode = 0
+
+            if responseCode == 200:
+                return videoURL
+
+        else:
+            return False

    for extension in imageTypes:
        if extension in URL:
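Review note on the `v.redd.it` hunk above: the loop tries each rendition from highest to lowest quality and returns the first one that answers with HTTP 200. It catches only `HTTPError`, so a DNS failure or timeout (`URLError`) would still propagate. A sketch of the same idea with the network probe factored out is shown below; `url_is_reachable` and `best_video_url` are hypothetical names, the quality list is copied from the diff, and this version returns `None` instead of `False` when nothing is found.

```python
import urllib.error
import urllib.request

# Rendition names copied from the diff, highest quality first.
QUALITIES = ("DASH_1080", "DASH_720", "DASH_600",
             "DASH_480", "DASH_360", "DASH_240")


def url_is_reachable(url, timeout=10.0):
    """Return True if the URL answers with HTTP 200 (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.getcode() == 200
    except (urllib.error.URLError, OSError):
        # HTTPError is a subclass of URLError; OSError covers socket timeouts.
        return False


def best_video_url(base_url, probe=url_is_reachable):
    """Return the first rendition URL the probe accepts, or None."""
    for quality in QUALITIES:
        candidate = f"{base_url}/{quality}"
        if probe(candidate):
            return candidate
    return None
```

Passing the probe as a parameter lets the selection logic be exercised without network access, for example with `best_video_url(url, probe=lambda u: u.endswith("DASH_480"))`.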
--- a/src/tools.py
+++ b/src/tools.py
@@ -14,7 +14,8 @@ class GLOBAL:
    config = None
    arguments = None
    directory = None
-    configDirectory = Path.home() / "Bulk Downloader for Reddit"
+    defaultConfigDirectory = Path.home() / "Bulk Downloader for Reddit"
+    configDirectory = ""
    reddit_client_id = "BSyphDdxYZAgVQ"
    reddit_client_secret = "bfqNJaRh8NMh-9eAr-t4TRz-Blk"
    printVanilla = print
Author	SHA1	Message	Date
Ali Parlakçı	d6194c57d9	Merge pull request #67 from dbanon87/dbanon87/erome-downloads fix erome download URLs	2019-10-17 09:51:02 +03:00
Ali Parlakçı	cd87a4a120	Merge pull request #68 from dbanon87/dbanon87/gitignore-env add env/ to gitignore	2019-10-17 09:49:36 +03:00
dbanon87	08cddf4c83	add env/ to gitignore This allows working in a virtualenv in the project directory.	2019-10-08 10:52:56 -04:00
dbanon87	88fa9e742d	fix erome download URLs	2019-10-08 10:51:56 -04:00
Ali Parlakçı	1c17f174a8	typo	2019-04-23 17:00:42 +03:00
Ali Parlakçı	9b36336ac3	typo	2019-04-23 16:51:05 +03:00
Ali Parlakçı	35e551f20c	Update README.md	2019-04-23 14:04:15 +03:00
Ali Parlakçı	0f2bda9c34	Merge pull request #63 from aliparlakci/moreUsefulReadme A more useful readme (credits to stared)	2019-04-23 14:00:53 +03:00
Ali Parlakçı	8ab694bcc1	Fixed typo	2019-04-23 13:59:01 +03:00
Ali	898f59d035	Added an FAQ entry	2019-04-23 13:51:21 +03:00
Ali	6b6db37185	Minor corrections	2019-04-23 13:29:58 +03:00
Piotr Migdał	d4a5100128	a clearer description how to run it (#62 )	2019-04-23 13:17:15 +03:00
Ali	22047338e2	Update version number	2019-04-09 20:45:22 +03:00
Ali	b16cdd3cbb	Hopefully, fixed the config.json bug	2019-04-09 20:31:42 +03:00
Ali	2a8394a48c	Fixed the bug concerning config.json	2019-04-08 22:09:52 +03:00
Ali Parlakçı	eac4404bbf	Update README.md	2019-03-31 11:59:49 +03:00
Ali Parlakci	fae49d50da	Update version	2019-03-31 11:46:03 +03:00
Ali Parlakci	7130525ece	Update version	2019-03-31 11:35:27 +03:00
Ali Parlakci	2bf1e03ee1	Update version	2019-03-31 11:33:29 +03:00
Ali	15a91e5784	Fixed saving auth info problem	2019-02-24 12:28:40 +03:00
Ali	344201a70d	Fixed v.redd.it links	2019-02-23 00:01:39 +03:00
Ali	92e47adb43	Update version	2019-02-22 23:59:57 +03:00
Ali	4d385fda60	Fixed v.redd.it links	2019-02-22 23:59:03 +03:00
Ali Parlakci	82dcd2f63d	Bug fix	2019-01-27 17:05:31 +03:00
Ali Parlakci	08de21a364	Updated Python3 version	2019-01-27 16:32:43 +03:00
Ali Parlakci	af7d3d9151	Moved FAQ	2019-01-27 16:32:00 +03:00