announce_url in bt_torrents table not updated/inserted (bt_torrent/bt_torrent.module)

hapydoyzer - April 27, 2009 - 06:24
Project: BitTorrent
Version: 6.x-9.x-dev
Component: Code
Category: bug report
Priority: normal
Assigned: Unassigned
Status: active
Description

First, sorry for my bad English.

File: bt_torrent/bt_torrent.module
When hook_insert and hook_update execute, the announce_url field (bt_torrents table) is not present in the INSERT/UPDATE queries. This results in blank announce_url fields for all torrents.

A blank announce_url causes an error inside hook_cron on this line:
foreach ($files as $info_hash => $scrape_info) {
because $scrape_info is blank.

I added a simple NOT (announce_url='') condition to the query to filter out unwanted torrents.
But I think hook_cron needs more checks before the foreach loop above runs.
Network errors may produce an incorrect (blank?) response from curl; an incorrect announce URL may raise this error too.
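
The kind of check meant here can be sketched as follows. This is only an illustration in Python (the module itself is PHP), not the module's code. It assumes the decoded scrape response is a dictionary with a "files" key mapping each info hash to its statistics, as in the common tracker scrape convention; the function name iter_scrape_files is hypothetical.

```python
def iter_scrape_files(decoded):
    """Yield (info_hash, scrape_info) pairs, skipping anything malformed.

    `decoded` is whatever the bdecode step returned. It may be None or an
    empty value after a network error or a bad announce URL.
    """
    if not isinstance(decoded, dict):
        return
    files = decoded.get("files")
    if not isinstance(files, dict):
        return
    for info_hash, scrape_info in files.items():
        # Only pass on entries that actually carry statistics.
        if isinstance(scrape_info, dict):
            yield info_hash, scrape_info
```

With a guard like this, a blank or failed response simply produces no iterations instead of an error inside the loop.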

P.S. The code in hook_nodeapi may need corrections too, but I have not touched it.

Attachment: bittorrent-add-missing-announce-url.patch (1.99 KB)

#1

hapydoyzer - May 15, 2009 - 11:18

This issue is NOT fixed in the latest CVS (currently commit #211946), as far as I can see from looking through the code.

#2

overall - May 15, 2009 - 13:57

The attached patch does not support "announce-list".

I suppose bradfordcp has no free time right now to discuss issues.
That is the cause of the delay.
At the moment I have no free time either, for a week or less...
So, hapydoyzer, if you want the module developed ASAP, you can help with development, but before doing so we need to choose the best way to implement this.

Currently I am also in contact by Skype and email with RoboPhred ( http://drupal.org/user/131239 ), who is also interested in this module.

I'd like to propose using the "announce_url" field to store the bencoded "announce-list" section, or "announce" when there is no list.
But we would also need to rewrite the scrape function to support "announce-list" - http://bittorrent.org/beps/bep_0012.html

#3

hapydoyzer - May 16, 2009 - 17:44
Status: needs review » needs work

I think it is better not to store the announce list as bencoded data.
When we do a massive scrape of other trackers, we would need to bdecode each announce_url field...

I think we need to create a separate table for this: one row per announce URL and info hash.
If we do it like this, we can run a simple SELECT DISTINCT announce_url to retrieve the list of trackers that need scraping.
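
A minimal sketch of that idea, using Python with an in-memory SQLite database purely for illustration (the module itself is PHP on Drupal's database layer). The table name bt_announce_urls and the helper names are assumptions, not part of the module.

```python
import sqlite3


def create_schema(conn):
    # One row per (info_hash, announce_url) pair; the table name is hypothetical.
    conn.execute(
        "CREATE TABLE bt_announce_urls ("
        " info_hash TEXT NOT NULL,"
        " announce_url TEXT NOT NULL,"
        " PRIMARY KEY (info_hash, announce_url))"
    )


def add_announce(conn, info_hash, announce_url):
    # INSERT OR IGNORE keeps re-saving a torrent from creating duplicate rows.
    conn.execute(
        "INSERT OR IGNORE INTO bt_announce_urls VALUES (?, ?)",
        (info_hash, announce_url),
    )


def trackers_to_scrape(conn):
    # The SELECT DISTINCT described above: each tracker appears once.
    cur = conn.execute(
        "SELECT DISTINCT announce_url FROM bt_announce_urls ORDER BY announce_url"
    )
    return [row[0] for row in cur]
```

No bdecoding is needed at scrape time, because each announce URL is already stored as plain text.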

#4

overall - May 17, 2009 - 01:46

Maybe that would be a better way.
But we must take into account that "announce-list" is not just a list of announce URLs; it is a list of tiers of announce URLs.
So the table structure would be [info_hash, tier_index (int), announce_url].
Or we can use two tables: bt_trackers [trid, announce_url] and bt_announces [trid, info_hash, tier_index (int)].
To the "bt_trackers" table we can add an additional field, "weight" (int), to sort trackers inside a tier (inside all tiers, actually) (see http://bittorrent.org/beps/bep_0012.html ).
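
The single-table variant can be sketched like this: a decoded torrent is flattened into [info_hash, tier_index, announce_url] rows. This is an illustration in Python, not module code; the function name announce_rows is hypothetical, and the fallback to "announce" when there is no "announce-list" follows BEP 12, which says the list takes precedence when present.

```python
def announce_rows(info_hash, torrent):
    """Flatten a decoded torrent into (info_hash, tier_index, announce_url) rows."""
    tiers = torrent.get("announce-list")
    if not tiers:
        # No announce-list: treat the single announce URL as one tier.
        announce = torrent.get("announce")
        tiers = [[announce]] if announce else []
    return [
        (info_hash, tier_index, url)
        for tier_index, tier in enumerate(tiers)
        for url in tier
    ]
```

The position of a URL inside its tier is not stored in this sketch; an extra column would be needed if that order has to be preserved.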

All would be OK, except in a case like this:

"a" stands for an announce URL

info_hash_1: [[a1], [a2, a3]],
info_hash_2: [[a2], [a1, a3]],
info_hash_3: [[a3], [a1, a2]],

Here the same tracker needs a different position for each torrent, so a single "weight" field per tracker, and therefore the two-table structure, becomes unsuitable...

So there is no simple way to implement "announce-list" following the http://bittorrent.org/beps/bep_0012.html standard.

Now I have reread http://bittorrent.org/beps/bep_0012.html once more and I am confused.
According to it, "announce-list" has the following structure: [ [ tracker1, tracker2 ], [backup1], [backup2, backup3] ], which I do not understand.
But judging by the uTorrent client, it uses "announce-list" as if it had the following structure: [ [ tracker1, tracker1_backup1, tracker1_backup2 ], [ tracker2, tracker2_backup1, tracker2_backup2 ], [ tracker3, tracker3_backup1, tracker3_backup2 ] ], which is clear to me.

#5

overall - May 19, 2009 - 14:24

There is one ambiguity about storing "announce_url" when the announce URL itself contains a passkey.
There can therefore be several announce URLs with the same domain name (server) but different passkeys.
But AFAIK a passkey is not needed to scrape a tracker.
So if the data from the table of announce URLs is used only for scraping, we can remove the passkey parameter from the announce URL before adding it to the table.
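
Removing the passkey could look roughly like the sketch below, written in Python for illustration only. It assumes the passkey is passed as a query parameter named "passkey"; trackers that embed the passkey in the URL path are not handled, and the function name strip_passkey is hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_passkey(announce_url, param="passkey"):
    """Return announce_url without the given query parameter."""
    parts = urlsplit(announce_url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    # Rebuild the URL with the remaining parameters (possibly none).
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

After stripping, announce URLs that differ only in passkey collapse to the same string, so each tracker is stored and scraped once.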

#6

hapydoyzer - May 19, 2009 - 19:13

I don't see an actual difference between
[ [ tracker1, tracker2 ], [backup1], [backup2, backup3] ]
and
[ [ tracker1, tracker1_backup1, tracker1_backup2 ], [ tracker2, tracker2_backup1, tracker2_backup2 ], [ tracker3, tracker3_backup1, tracker3_backup2 ] ]

In the first example [backup1] and [backup2, backup3] will never be used if one of [ tracker1, tracker2 ] works.

I think we can implement scraping in three ways:

  1. Scrape only the tracker specified in the announce (not announce-list) field. This is simple and doesn't require additional tables or fields.
  2. Scrape all trackers. This is also simple, and this way we can catch ALL peers of the torrent, but it needs more HTTP requests. Announce URLs are stored in an additional table with an info_hash field.
  3. Select the best tier of trackers, as ordinary torrent clients do. We need to respect the order and structure of announce-list and select a working announce tier for scraping according to BEP0012.

About the third:

If all torrent clients did what the specification says, almost all peers would be on one tier of trackers. But I don't know whether that is true in the real world.

And I think we can implement more than one scraping strategy and let the administrator choose the best one.
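
The third strategy can be sketched as below, in Python for illustration only. Tiers are tried in order, and a tracker that answers is moved to the front of its tier, as BEP 12 describes. The initial random shuffle of each tier that BEP 12 also requires is left out to keep the example deterministic. The names pick_tracker and is_alive are hypothetical; is_alive stands for whatever check (for example an HTTP scrape attempt) decides that a tracker responded.

```python
def pick_tracker(tiers, is_alive):
    """Return the first responding tracker, walking the tiers in order.

    `tiers` is a list of lists of announce URLs and is updated in place:
    a responding tracker is moved to the front of its own tier.
    """
    for tier in tiers:
        for index, url in enumerate(tier):
            if is_alive(url):
                tier.insert(0, tier.pop(index))
                return url
    return None  # no tracker in any tier responded
```

For [ [ tracker1, tracker2 ], [backup1], [backup2, backup3] ], the backup tiers are only reached when neither tracker1 nor tracker2 responds.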

#7

hapydoyzer - May 19, 2009 - 19:32
Status: needs work » active

> But AFAIK there is no need for passkey for scraping a tracker.

I think uploading torrents that carry a passkey for a tracker other than ours is stupid, because the user compromises his own passkey (for that other tracker).

Maybe we need to warn the user when such a torrent is uploaded? Or silently strip the passkey= portion from announce_url? Or maybe remove this announce_url completely, because the target tracker probably doesn't accept anonymous (passkey-less) requests to /announce?

#8

overall - May 20, 2009 - 08:38

The difference between
[ [ tracker1, tracker2 ], [backup1], [backup2, backup3] ]
and
[ [ tracker1, tracker1_backup1, tracker1_backup2 ], [ tracker2, tracker2_backup1, tracker2_backup2 ], [ tracker3, tracker3_backup1, tracker3_backup2 ] ]
is:
The first case functions as you (and probably BEP0012) say.
The second case is what uTorrent actually does.
First I must say that uTorrent supports several trackers at the same time.
For example: http://thepiratebay.org/torrent/3933809/Queen_discography_(MP3_320Kbps)

It has the following structure:

...
    [announce] => http://tracker.thepiratebay.org/announce
    [announce-list] => Array
        (
            [0] => Array
                (
                    [0] => http://tracker.thepiratebay.org/announce
                )

            [1] => Array
                (
                    [0] => udp://tracker.thepiratebay.org:80/announce
                )

            [2] => Array
                (
                    [0] => http://www.h33t.com:3310/announce
                )

        )
...

i.e. three tiers, each with one tracker in it.
And when I start downloading this torrent in uTorrent, it works with all three trackers simultaneously (except udp://tracker.thepiratebay.org:80/announce, because that one just doesn't work).

So I suppose there is a difference between BEP0012 and what uTorrent actually does.

#9

overall - May 20, 2009 - 08:46

> I think we can implement scraping in three ways:
1. It's a bad option because it does not support "announce-list", as you know.
2. It's the wrong way because some of the trackers are just backup servers that actually operate on one DB. If we sum their info, the result will be wrong.
3. The right way, but see my previous post.

> If all torrent clients do things like specification said, almost all peers would be on one tier of trackers. But I don`t know it is true or false in real world.
See my previous post.

> And I think we can implement more than one scraping strategy to let administrator to choose between them and select the best.
I think there is only one right strategy and we must implement it.

#10

overall - May 20, 2009 - 08:57

> I think uploading of torrents with passkey for trackers different from our is stupid, because user compromise his own passkey (for different tracker).
Maybe this is stupid, but many things on the Internet are based on trust in sites. It depends on the actual sites.

This is useful when uploading to our tracker a torrent file downloaded from another (in our case, private) tracker.
The user just can't remove the passkey without using special tools, so the only thing he can do is upload it as is.

> Maybe we need to warn user on uploading such torrent? Or silently strip passkey= portion from announce_url? Or maybe remove this announce_url completely, because target tracker probably don`t accept anonymous (without passkey) requests to /announce?
I think at first we can just implement all variants, with an option in the admin section. Or ask the user to choose.

#11

overall - May 20, 2009 - 09:00

hapydoyzer, can you explain to me in Russian by email what a client must actually do according to BEP0012?
Because I don't fully understand it.
