mirror of
https://github.com/KevinMidboe/bulk-downloader-for-reddit.git
synced 2026-01-18 15:16:09 +00:00
Compare commits
14 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | cc93aa3012 | | |
| | 50c4a8d6d7 | | |
| | 5737904a54 | | |
| | f6eba6c5b0 | | |
| | 41cbb58db3 | | |
| | c569124406 | | |
| | 1a3836a8e1 | | |
| | fde6a1fac4 | | |
| | 6bba2c4dbb | | |
| | a078d44236 | | |
| | deae0be769 | | |
| | 3cf0203e6b | | |
| | 0b31db0e2e | | |
| | d3f2b1b08e | | |
23
README.md
@@ -6,7 +6,7 @@ This program downloads imgur, gfycat and direct image and video links of saved p
|
|||||||
## What it can do
|
## What it can do
|
||||||
- Can get posts from: frontpage, subreddits, multireddits, redditor's submissions, upvoted and saved posts; search results or just plain reddit links
|
- Can get posts from: frontpage, subreddits, multireddits, redditor's submissions, upvoted and saved posts; search results or just plain reddit links
|
||||||
- Sorts posts by hot, top, new and so on
|
- Sorts posts by hot, top, new and so on
|
||||||
- Downloads imgur albums, gfycat links, [self posts](#i-cant-open-the-self-post-files) and any link to a direct image
|
- Downloads imgur albums, gfycat links, [self posts](#how-do-i-open-self-post-files) and any link to a direct image
|
||||||
- Skips the existing ones
|
- Skips the existing ones
|
||||||
- Puts post titles to file's name
|
- Puts post titles to file's name
|
||||||
- Puts every post to its subreddit's folder
|
- Puts every post to its subreddit's folder
|
||||||
@@ -19,12 +19,12 @@ This program downloads imgur, gfycat and direct image and video links of saved p
|
|||||||
## How it works
|
## How it works
|
||||||
|
|
||||||
- For **Windows** and **Linux** users, there are executable files to run easily without installing a third party program. But if you are a paranoid like me, you can **[compile it from source code](docs/COMPILE_FROM_SOURCE.md)**.
|
- For **Windows** and **Linux** users, there are executable files to run easily without installing a third party program. But if you are a paranoid like me, you can **[compile it from source code](docs/COMPILE_FROM_SOURCE.md)**.
|
||||||
- In Windows, double click on script.exe file
|
- In Windows, double click on bulk-downloader-for-reddit file
|
||||||
- In Linux, extract files to a folder and open terminal inside it. Type **`./script`**
|
- In Linux, extract files to a folder and open terminal inside it. Type **`./bulk-downloader-for-reddit`**
|
||||||
|
|
||||||
- **MacOS** users have to **[compile it from source code](docs/COMPILE_FROM_SOURCE.md)**.
|
- **MacOS** users have to **[compile it from source code](docs/COMPILE_FROM_SOURCE.md)**.
|
||||||
|
|
||||||
Script also accepts **command-line arguments**, get further information from **[`python script.py --help`](docs/COMMAND_LINE_ARGUMENTS.md)**
|
Script also accepts **command-line arguments**, get further information from **[`--help`](docs/COMMAND_LINE_ARGUMENTS.md)**
|
||||||
|
|
||||||
## Setting up the script
|
## Setting up the script
|
||||||
Because this is not a commercial app, you need to create an imgur developer app in order API to work.
|
Because this is not a commercial app, you need to create an imgur developer app in order API to work.
|
||||||
@@ -42,11 +42,22 @@ It should redirect to a page which shows your **imgur_client_id** and **imgur_cl
|
|||||||
\* Select **OAuth 2 authorization without a callback URL** first then select **Anonymous usage without user authorization** if it says *Authorization callback URL: required*
|
\* Select **OAuth 2 authorization without a callback URL** first then select **Anonymous usage without user authorization** if it says *Authorization callback URL: required*
|
||||||
|
|
||||||
## FAQ
|
## FAQ
|
||||||
### I can't open the self post files.
|
### How do I open self post files?
|
||||||
- Self posts are held at reddit as styled with markdown. So, the script downloads them as they are in order not to lose their stylings.
|
- Self posts are held at reddit as styled with markdown. So, the script downloads them as they are in order not to lose their stylings.
|
||||||
However, there is a [great Chrome extension](https://chrome.google.com/webstore/detail/markdown-viewer/ckkdlimhmcjmikdlpkmbgfkaikojcbjk) for viewing Markdown files with its styling. Install it and open the files with [Chrome](https://www.google.com/intl/tr/chrome/).
|
However, there is a [great Chrome extension](https://chrome.google.com/webstore/detail/markdown-viewer/ckkdlimhmcjmikdlpkmbgfkaikojcbjk) for viewing Markdown files with its styling. Install it and open the files with [Chrome](https://www.google.com/intl/tr/chrome/).
|
||||||
|
|
||||||
|
However, they are basically text files. You can also view them with any text editor such as Notepad on Windows, gedit on Linux or Text Editor on MacOS
|
||||||
|
|
||||||
|
### How can I change my credentials?
|
||||||
|
- All of the user data is held in **config.json** file which is in a folder named "Bulk Downloader for Reddit" in your **Home** directory. You can edit
|
||||||
|
them, there.
|
||||||
|
|
||||||
## Changelog
|
## Changelog
|
||||||
|
### [19/07/2018](https://github.com/aliparlakci/bulk-downloader-for-reddit/tree/50c4a8d6d7e21d9b44a6d6d00c1811cfe9c655b1)
|
||||||
|
- Added v.redd.it support
|
||||||
|
- Added custom exception descriptions to FAILED.json file
|
||||||
|
- Fixed the bug that prevents downloading some gfycat URLs
|
||||||
|
|
||||||
### [13/07/2018](https://github.com/aliparlakci/bulk-downloader-for-reddit/tree/9f831e1b784a770c82252e909462871401a05c11)
|
### [13/07/2018](https://github.com/aliparlakci/bulk-downloader-for-reddit/tree/9f831e1b784a770c82252e909462871401a05c11)
|
||||||
- Change config.json file's path to home directory
|
- Change config.json file's path to home directory
|
||||||
|
|
||||||
|
|||||||
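The FAQ entry added in this README diff says that user data lives in **config.json**, inside a folder named "Bulk Downloader for Reddit" in the Home directory. A minimal sketch of locating that file follows. It assumes that `Path.home()` resolves to the same directory the program treats as Home; `config_path` is a hypothetical helper name, not part of the project.

```python
from pathlib import Path


def config_path(home=None):
    """Return the expected location of config.json (hypothetical helper).

    `home` lets the caller override the base directory; by default the
    current user's home directory is used, which is an assumption about
    how the program resolves "Home".
    """
    base = Path(home) if home is not None else Path.home()
    return base / "Bulk Downloader for Reddit" / "config.json"


if __name__ == "__main__":
    path = config_path()
    print(path, "(exists)" if path.exists() else "(not found)")
```

Running the snippet prints the path to open in a text editor when credentials need to be changed.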
@@ -1,5 +0,0 @@
|
|||||||
theme: jekyll-theme-minimal
|
|
||||||
show_downloads: false
|
|
||||||
#title: Bulk Downloader for Reddit
|
|
||||||
description: Code written by Ali PARLAKCI
|
|
||||||
google_analytics: UA-80780721-3
|
|
||||||
@@ -2,7 +2,7 @@
|
|||||||
|
|
||||||
See **[compiling from source](COMPILE_FROM_SOURCE.md)** page first unless you are using an executable file. If you are using an executable file, see [using terminal](COMPILE_FROM_SOURCE.md#using-terminal) and come back.
|
See **[compiling from source](COMPILE_FROM_SOURCE.md)** page first unless you are using an executable file. If you are using an executable file, see [using terminal](COMPILE_FROM_SOURCE.md#using-terminal) and come back.
|
||||||
|
|
||||||
***Use*** `.\script.exe` ***or*** `./script` ***if you are using the executable***.
|
***Use*** `.\bulk-downloader-for-reddit.exe` ***or*** `./bulk-downloader-for-reddit` ***if you are using the executable***.
|
||||||
```console
|
```console
|
||||||
$ python script.py --help
|
$ python script.py --help
|
||||||
usage: script.py [-h] [--directory DIRECTORY] [--link link] [--saved]
|
usage: script.py [-h] [--directory DIRECTORY] [--link link] [--saved]
|
||||||
@@ -50,7 +50,7 @@ python script.py
|
|||||||
```
|
```
|
||||||
|
|
||||||
```console
|
```console
|
||||||
.\script.exe
|
.\bulk-downloader-for-reddit.exe
|
||||||
```
|
```
|
||||||
|
|
||||||
```console
|
```console
|
||||||
@@ -58,11 +58,11 @@ python script.py
|
|||||||
```
|
```
|
||||||
|
|
||||||
```console
|
```console
|
||||||
.\script.exe -- directory .\\NEW_FOLDER --search cats --sort new --time all --subreddit gifs pics --NoDownload
|
.\bulk-downloader-for-reddit.exe -- directory .\\NEW_FOLDER --search cats --sort new --time all --subreddit gifs pics --NoDownload
|
||||||
```
|
```
|
||||||
|
|
||||||
```console
|
```console
|
||||||
./script --directory .\\NEW_FOLDER\\ANOTHER_FOLDER --saved --limit 1000
|
./bulk-downloader-for-reddit --directory .\\NEW_FOLDER\\ANOTHER_FOLDER --saved --limit 1000
|
||||||
```
|
```
|
||||||
|
|
||||||
```console
|
```console
|
||||||
|
|||||||
@@ -15,7 +15,7 @@ Latest* version of **Python 3** is needed. See if it is already installed [here]
|
|||||||
- **On MacOS**: Look for an app called **Terminal**.
|
- **On MacOS**: Look for an app called **Terminal**.
|
||||||
|
|
||||||
### Navigating to the directory where script is downloaded
|
### Navigating to the directory where script is downloaded
|
||||||
Go inside the folder where script.py is located. If you are not familier with changing directories on command-prompt and terminal read *Changing Directories* in [this article](https://lifehacker.com/5633909/who-needs-a-mouse-learn-to-use-the-command-line-for-almost-anything)
|
Go inside the folder where script.py is located. If you are not familiar with changing directories on command-prompt and terminal read *Changing Directories* in [this article](https://lifehacker.com/5633909/who-needs-a-mouse-learn-to-use-the-command-line-for-almost-anything)
|
||||||
|
|
||||||
## Finding the correct keyword for Python
|
## Finding the correct keyword for Python
|
||||||
Enter these lines to terminal window until it prints out the version you have downloaded and installed:
|
Enter these lines to terminal window until it prints out the version you have downloaded and installed:
|
||||||
|
|||||||
@@ -22,7 +22,7 @@ from src.tools import (GLOBAL, createLogFile, jsonFile, nameCorrector,
|
|||||||
|
|
||||||
__author__ = "Ali Parlakci"
|
__author__ = "Ali Parlakci"
|
||||||
__license__ = "GPL"
|
__license__ = "GPL"
|
||||||
__version__ = "1.1.2"
|
__version__ = "1.2.0"
|
||||||
__maintainer__ = "Ali Parlakci"
|
__maintainer__ = "Ali Parlakci"
|
||||||
__email__ = "parlakciali@gmail.com"
|
__email__ = "parlakciali@gmail.com"
|
||||||
|
|
||||||
@@ -246,7 +246,6 @@ class PromptUser:
|
|||||||
# DELETE THE PLUS (+) AT THE END
|
# DELETE THE PLUS (+) AT THE END
|
||||||
GLOBAL.arguments.subreddit = GLOBAL.arguments.subreddit[:-1]
|
GLOBAL.arguments.subreddit = GLOBAL.arguments.subreddit[:-1]
|
||||||
|
|
||||||
print(GLOBAL.arguments.subreddit)
|
|
||||||
print("\nselect sort type:")
|
print("\nselect sort type:")
|
||||||
sortTypes = [
|
sortTypes = [
|
||||||
"hot","top","new","rising","controversial"
|
"hot","top","new","rising","controversial"
|
||||||
@@ -539,7 +538,7 @@ def download(submissions):
|
|||||||
downloadedCount -= 1
|
downloadedCount -= 1
|
||||||
|
|
||||||
except NotADownloadableLinkError as exception:
|
except NotADownloadableLinkError as exception:
|
||||||
print("Could not read the page source")
|
print(exception)
|
||||||
FAILED_FILE.add({int(i+1):[str(exception),submissions[i]]})
|
FAILED_FILE.add({int(i+1):[str(exception),submissions[i]]})
|
||||||
downloadedCount -= 1
|
downloadedCount -= 1
|
||||||
|
|
||||||
|
|||||||
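The hunk above replaces a hard-coded message with `print(exception)`, and elsewhere in this comparison the `raise` sites gain a message argument. The two changes go together: an exception raised without arguments has an empty string representation, so printing it would show nothing. The sketch below illustrates this with a stand-in exception class; the class body and the helper names `get_link` and `describe_failure` are assumptions for illustration, not the project's code.

```python
class NotADownloadableLinkError(Exception):
    """Stand-in for the project's exception class (definition assumed)."""


def get_link(page_source):
    """Hypothetical helper: fail with a descriptive message on empty input."""
    if not page_source:
        raise NotADownloadableLinkError("Could not read the page source")
    return page_source


def describe_failure(page_source):
    """Return the text that print(exception) would show, or "" on success."""
    try:
        get_link(page_source)
    except NotADownloadableLinkError as exception:
        # str(exception) is the message passed to the constructor
        return str(exception)
    return ""


if __name__ == "__main__":
    print(repr(describe_failure("")))
    print(repr(str(NotADownloadableLinkError())))
```

Keeping the message at the `raise` site means the same text can be printed and written to the failure log without repeating it in every `except` block.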
@@ -36,7 +36,10 @@ def getExtension(link):
|
|||||||
if TYPE in parsed:
|
if TYPE in parsed:
|
||||||
return "."+parsed[-1]
|
return "."+parsed[-1]
|
||||||
else:
|
else:
|
||||||
return '.jpg'
|
if not "v.redd.it" in link:
|
||||||
|
return '.jpg'
|
||||||
|
else:
|
||||||
|
return '.mp4'
|
||||||
|
|
||||||
def getFile(fileDir,tempDir,imageURL,indent=0):
|
def getFile(fileDir,tempDir,imageURL,indent=0):
|
||||||
"""Downloads given file to given directory.
|
"""Downloads given file to given directory.
|
||||||
@@ -169,7 +172,9 @@ class Imgur:
|
|||||||
if duplicates == imagesLenght:
|
if duplicates == imagesLenght:
|
||||||
raise FileAlreadyExistsError
|
raise FileAlreadyExistsError
|
||||||
elif howManyDownloaded < imagesLenght:
|
elif howManyDownloaded < imagesLenght:
|
||||||
raise AlbumNotDownloadedCompletely
|
raise AlbumNotDownloadedCompletely(
|
||||||
|
"Album Not Downloaded Completely"
|
||||||
|
)
|
||||||
|
|
||||||
@staticmethod
|
@staticmethod
|
||||||
def initImgur():
|
def initImgur():
|
||||||
@@ -217,9 +222,9 @@ class Gfycat:
|
|||||||
try:
|
try:
|
||||||
POST['mediaURL'] = self.getLink(POST['postURL'])
|
POST['mediaURL'] = self.getLink(POST['postURL'])
|
||||||
except IndexError:
|
except IndexError:
|
||||||
raise NotADownloadableLinkError
|
raise NotADownloadableLinkError("Could not read the page source")
|
||||||
except Exception as exception:
|
except Exception as exception:
|
||||||
raise NotADownloadableLinkError
|
raise NotADownloadableLinkError("Could not read the page source")
|
||||||
|
|
||||||
POST['postExt'] = getExtension(POST['mediaURL'])
|
POST['postExt'] = getExtension(POST['mediaURL'])
|
||||||
|
|
||||||
@@ -248,8 +253,7 @@ class Gfycat:
|
|||||||
if url[-1:] == '/':
|
if url[-1:] == '/':
|
||||||
url = url[:-1]
|
url = url[:-1]
|
||||||
|
|
||||||
if 'gifs' in url:
|
url = "https://gfycat.com/" + url.split('/')[-1]
|
||||||
url = "https://gfycat.com/" + url.split('/')[-1]
|
|
||||||
|
|
||||||
pageSource = (urllib.request.urlopen(url).read().decode().split('\n'))
|
pageSource = (urllib.request.urlopen(url).read().decode().split('\n'))
|
||||||
|
|
||||||
@@ -266,7 +270,7 @@ class Gfycat:
|
|||||||
break
|
break
|
||||||
|
|
||||||
if "".join(link) == "":
|
if "".join(link) == "":
|
||||||
raise NotADownloadableLinkError
|
raise NotADownloadableLinkError("Could not read the page source")
|
||||||
|
|
||||||
return "".join(link)
|
return "".join(link)
|
||||||
|
|
||||||
|
|||||||
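The `getExtension` hunk above adds a fallback: when no known extension is found in the link, `v.redd.it` links get `.mp4` and everything else still defaults to `.jpg`. A flattened sketch of that logic is given here. The list of known types and the surrounding loop are assumptions, since the diff shows only part of the function; `get_extension` and `KNOWN_TYPES` are illustrative names.

```python
# Assumed list of known extensions; the project's actual list is not shown in the diff.
KNOWN_TYPES = ("jpg", "png", "mp4", "webm", "gif")


def get_extension(link):
    """Guess a file extension from a media URL."""
    parsed = link.split(".")
    for file_type in KNOWN_TYPES:
        if file_type in parsed:
            # Mirrors the diff: use the text after the last dot
            return "." + parsed[-1]
    # No known extension found: v.redd.it links are video, the rest default to .jpg
    return ".mp4" if "v.redd.it" in link else ".jpg"


if __name__ == "__main__":
    for url in ("https://i.imgur.com/abc.png",
                "https://v.redd.it/xyz/DASH_600_K",
                "https://i.reddituploads.com/abc"):
        print(url, "->", get_extension(url))
```

Writing the fallback as a single conditional expression avoids the nested `else: if not ... else:` shape in the diff while keeping the same outcomes for the cases shown.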
@@ -397,8 +397,9 @@ def checkIfMatching(submission):
|
|||||||
imgurCount += 1
|
imgurCount += 1
|
||||||
return details
|
return details
|
||||||
|
|
||||||
elif isDirectLink(submission.url):
|
elif isDirectLink(submission.url) is not None:
|
||||||
details['postType'] = 'direct'
|
details['postType'] = 'direct'
|
||||||
|
details['postURL'] = isDirectLink(submission.url)
|
||||||
directCount += 1
|
directCount += 1
|
||||||
return details
|
return details
|
||||||
|
|
||||||
@@ -435,7 +436,7 @@ def printSubmission(SUB,validNumber,totalNumber):
|
|||||||
|
|
||||||
def isDirectLink(URL):
|
def isDirectLink(URL):
|
||||||
"""Check if link is a direct image link.
|
"""Check if link is a direct image link.
|
||||||
If so, return True,
|
If so, return URL,
|
||||||
if not, return False
|
if not, return False
|
||||||
"""
|
"""
|
||||||
|
|
||||||
@@ -444,10 +445,13 @@ def isDirectLink(URL):
|
|||||||
URL = URL[:-1]
|
URL = URL[:-1]
|
||||||
|
|
||||||
if "i.reddituploads.com" in URL:
|
if "i.reddituploads.com" in URL:
|
||||||
return True
|
return URL
|
||||||
|
|
||||||
|
elif "v.redd.it" in URL:
|
||||||
|
return URL+"/DASH_600_K"
|
||||||
|
|
||||||
for extension in imageTypes:
|
for extension in imageTypes:
|
||||||
if extension in URL:
|
if extension in URL:
|
||||||
return True
|
return URL
|
||||||
else:
|
else:
|
||||||
return False
|
return False
|
||||||
|
|||||||
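One point worth checking in the hunks above: the caller now tests `isDirectLink(submission.url) is not None`, while the function, as displayed, still returns `False` for links that do not match. Because `False is not None` evaluates to `True`, that branch would accept every submission that reaches it, unless code outside the shown hunks prevents this. A minimal sketch of a variant that returns `None` on no match follows. The extension list and the trailing-slash handling are assumptions based on the visible context, and `direct_link` is a hypothetical name.

```python
# Assumed list of direct-media extensions; the project's list is not shown in the diff.
IMAGE_TYPES = (".jpg", ".png", ".mp4", ".webm", ".gif")


def direct_link(url):
    """Return a downloadable URL for direct media links, or None otherwise."""
    if url.endswith("/"):
        url = url[:-1]

    if "i.reddituploads.com" in url:
        return url
    if "v.redd.it" in url:
        # Same suffix the diff appends for v.redd.it videos
        return url + "/DASH_600_K"
    if any(extension in url for extension in IMAGE_TYPES):
        return url
    return None


if __name__ == "__main__":
    print(direct_link("https://v.redd.it/abc/"))
    print(direct_link("https://example.com/page"))
    print(False is not None)
```

With this shape the caller can also store the result once and reuse it, instead of calling the function a second time to fill in `postURL`.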