How do I use an image's md5 to look up its Gelbooru metadata?


Gelbooru's default filename is 'md5.ext'.

A filename like this doesn't carry much useful information. How can the md5 be used to look up metadata such as the post id and tags?

URL = 'http://gelbooru.com/index.php?page=dapi&s=post&q=index&tags=md5%3a{0}'

where {0} can be either:

md5
md5*  (i.e. the downloaded 'md5.ext' filename, using * as a wildcard to match the extension)

For details, see: http://gelbooru.com/index.php?page=help&topic=cheatsheet
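
A minimal sketch of the lookup in Python 2 (matching the downloader below), parsing the dapi response with the stdlib. The XML layout assumed here (a <posts> root whose <post> children carry id, tags, etc. as attributes) and the helper name lookup_md5 are mine; check the cheatsheet above against the actual output.

import urllib2
import xml.etree.ElementTree as ET

URL = 'http://gelbooru.com/index.php?page=dapi&s=post&q=index&tags=md5%3a{0}'

def lookup_md5(md5):
    # Query the post index, filtering by the md5: meta-tag.
    xml_data = urllib2.urlopen(URL.format(md5)).read()
    root = ET.fromstring(xml_data)
    # Assumed layout: each <post> holds its metadata as XML attributes.
    return [(post.get('id'), post.get('tags')) for post in root.findall('post')]

# Example: recover the md5 from a downloaded 'md5.ext' filename.
# fn = '5d41402abc4b2a76b9719d911017c592.jpg'
# for post_id, tags in lookup_md5(fn.split('.')[0]):
#     print post_id, tags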


=============================================================
In addition, yande.re supports a similar query.

URL = 'https://yande.re/post.json?{0}'

Parameters:
------------------------------------------------------------
    * limit=100  # results per request; the maximum is 100
    * page=1     # page index, starting at 1
    * tags=''    # filter results by tags


To look up a specific post by id, use tags=id:{post_id}, as in the sketch below.
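
A minimal sketch of the same query in Python 2, again stdlib-only. The helper name yandere_posts is mine, and the field names in the returned JSON objects (e.g. 'id', 'tags', 'file_url') are assumptions about the Moebooru post format.

import json
import urllib
import urllib2

def yandere_posts(tags='', page=1, limit=100):
    # post.json returns a JSON array of post objects.
    query = urllib.urlencode({'tags': tags, 'page': page, 'limit': limit})
    return json.loads(urllib2.urlopen('https://yande.re/post.json?' + query).read())

# Example: fetch a single post by id.
# post = yandere_posts(tags='id:12345')[0]
# print post['tags']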

=============================================================
A simple flvxz.com parsing tool


=============================================================
A small tool for converting .gjots to .epub

gjots folders are not supported.

pandoc 1.9.1.1 does not support subchapters; if you need them, the toc.ncx file has to be edited by hand.
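
A minimal sketch of the conversion idea in Python 2, assuming the gjots2 on-disk format: '\NewEntry' lines delimit entries, each entry's first line is its title, and '\NewFolder'/'\EndFolder' mark folders (skipped here, matching the limitation above). The helper gjots_to_markdown and the intermediate filenames are mine; pandoc then builds the epub, turning the '#' headings into chapters.

import subprocess
import sys

def gjots_to_markdown(path):
    chapters = []
    for line in open(path):
        line = line.rstrip('\n')
        if line == '\\NewEntry':
            chapters.append([])            # start a new entry (assumed marker)
        elif line in ('\\NewFolder', '\\EndFolder'):
            continue                       # folders are not supported
        elif chapters:
            chapters[-1].append(line)
    # The first line of each entry becomes a pandoc chapter heading.
    return '\n\n'.join('# {0}\n\n{1}'.format(c[0], '\n'.join(c[1:]))
                       for c in chapters if c)

if __name__ == '__main__':
    open('book.md', 'w').write(gjots_to_markdown(sys.argv[1]))
    subprocess.check_call(['pandoc', '-o', 'book.epub', 'book.md'])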

=============================================================
A simple g.e-hentai downloader

#!/usr/bin/env python
#-*- coding:utf-8 -*-

import os
import urllib2
import sys
from BeautifulSoup import BeautifulSoup

try:
    url = sys.argv[1]
except IndexError:
    print "geh - A simple g.e-hentai downloader\nUsage: geh.py [url]"
    sys.exit(1)

def find_next_page_link(tag):
    # The '>' anchor on an index page links to the next page of
    # thumbnails; it is absent on the last page.
    try:
        return tag.name == 'a' and tag.text == '>'
    except TypeError:
        return False
               

def find_image(html):
    # The full-size image on a viewer page is the <img> with id='img'.
    soup = BeautifulSoup(html)
    return soup.find('img', {'id': 'img'})['src']
    

def parse_index(url):
    # An index page gives us the gallery title, the '>' tag linking to
    # the next index page (None on the last page), and the per-image
    # viewer links inside the thumbnail grid (div#gdt).
    html = urllib2.urlopen(url).read()
    soup = BeautifulSoup(html)

    title = soup.h1.text
    next_page_link = soup.find(find_next_page_link)
    image_pages = [node['href'] for node in soup.find('div', {'id': 'gdt'}).findAll('a')]

    return title, next_page_link, image_pages

c = 1  # running image counter, used for the zero-padded filenames
while True:
    title, next_page_link, image_pages = parse_index(url)

    # Save into a directory named after the gallery title.
    dst_dir = title
    if not os.path.exists(dst_dir):
        os.mkdir(dst_dir)

    for page in image_pages:
        html = urllib2.urlopen(page).read()
        image_url = find_image(html)

        # Name files 001.jpg, 002.png, ..., keeping the original extension.
        fn = '{0}.{1}'.format(str(c).zfill(3), image_url.split('.')[-1])
        fn = os.path.join(dst_dir, fn)

        print '{0}: {1} ... '.format(dst_dir, str(c).zfill(3)),
        with open(fn, 'wb') as f:
            f.write(urllib2.urlopen(image_url).read())
        print 'done'
        c += 1

    # Stop when there is no '>' link, i.e. this was the last index page.
    if not next_page_link:
        break

    url = next_page_link['href']