no related content
toma - August 25, 2008 - 10:22
| Project: | Memetracker |
| Version: | 6.x-1.1-alpha5 |
| Component: | Code |
| Category: | support request |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | active |
Description
Hi
I installed the module and it works just fine, but I notice some duplicate content and no related content like I can see in your demo. I launched my French web site at www.biladi.info
Thanks

#1
I would say it's probably because you didn't correctly install the Python dependencies. I had a similar problem: http://drupal.org/node/290298
#2
I contacted the server administrator and everything is installed correctly. What do I have to type to check that it's installed correctly? My server is CentOS Enterprise 4.6 i686.
#3
On the command line type python to enter the python interactive interpreter.
Then there type "from Pycluster import *"
If that doesn't fail, then Pycluster *should* be installed correctly.
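The same check can be run non-interactively. The following is only a minimal sketch, assuming a Python version that ships `importlib` (newer than the 2.4 interpreter shown later in this thread); `check_modules` is a hypothetical helper name, not part of Memetracker or Pycluster.

```python
import importlib


def check_modules(names):
    """Try to import each module; map its name to None on success
    or to the import error text on failure."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = None
        except ImportError as exc:
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    # Module names taken from this thread; adjust to what your setup uses.
    status = check_modules(["Pycluster", "numpy", "Numeric"])
    for name, error in status.items():
        print(name, "OK" if error is None else "MISSING: " + error)
```

A clean import only shows that the module can be loaded. It does not prove that the clustering script itself runs, so it is a first check rather than a full diagnosis.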
#4
Thanks for your reply, thats what i get :
[root@server ~]# python
Python 2.4.3 (#1, Feb 23 2008, 08:24:54)
[GCC 3.4.6 20060404 (Red Hat 3.4.6-9)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from Pycluster import *
>>>
#5
It seems that Pycluster and Numeric are installed correctly, then.
If you're still not seeing clusters, there are a few other possible reasons. Try running cron several more times. It can take many cron runs to download new content from all your feeds (handy tip -- navigate to /admin/content/feed/list -- there you can see the last time each feed was updated).
Another possible problem is that there might just not be any memes. If you have too few feeds and/or the content from your feeds is unrelated, then memetracker will (correctly) not display any feeds.
#6
I set cron to run every 10 minutes, and I am sure some articles have the same title and content, yet nothing shows as related. I can give you admin access to my web site if you want to take a look.
#7
That might speed up debugging. Could I also have access to your database through phpmyadmin or something? That'd be really helpful as well.
#8
Hi, thanks for your help. I just contacted you through the Drupal contact form with all the information you need.
#9
Huh. Everything looks fine in your database and on your website. Copy the attached file to your webserver, remove the "_.txt" extension and run
python cluster.py
This file is an exact copy of the Python script in Memetracker that interfaces with the clustering library, except that it uses dummy data rather than live data from Memetracker. If this script runs properly, then Pycluster is installed correctly and something else in the code is wrong. If it fails, then either Pycluster or Python Numeric is installed incorrectly.
The output should be:
Cluster output:
9,8,0.0859780715517;3,2,0.121742078872;6,1,0.235191106376;10,-2,0.474667173546;-1,4,0.514123907524;7,-3,0.550835588735;
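Comparing that long line by eye is error-prone, so here is a rough sketch of a checker. It assumes the output is a semicolon-separated list of "left,right,distance" triples, which is what the sample above looks like. `parse_cluster_output` and `matches_expected` are hypothetical helper names and are not part of Memetracker.

```python
EXPECTED = (
    "9,8,0.0859780715517;3,2,0.121742078872;6,1,0.235191106376;"
    "10,-2,0.474667173546;-1,4,0.514123907524;7,-3,0.550835588735;"
)


def parse_cluster_output(text):
    """Turn 'a,b,dist;a,b,dist;...' into a list of (int, int, float) tuples."""
    text = text.replace("Cluster output:", "")
    # Drop all whitespace so that terminal line wrapping does not matter.
    text = "".join(text.split())
    nodes = []
    for chunk in text.split(";"):
        if not chunk:
            continue
        left, right, distance = chunk.split(",")
        nodes.append((int(left), int(right), float(distance)))
    return nodes


def matches_expected(text, expected=EXPECTED, tolerance=1e-9):
    """Compare pasted output with the expected output, allowing tiny float noise."""
    got = parse_cluster_output(text)
    want = parse_cluster_output(expected)
    if len(got) != len(want):
        return False
    return all(
        g[0] == w[0] and g[1] == w[1] and abs(g[2] - w[2]) <= tolerance
        for g, w in zip(got, want)
    )
```

Paste the text printed by your own run into `matches_expected(...)`; it returns True when every triple agrees with the expected values.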
#10
Hi, this is what I get:
[root@server ~]# python cluster.py
Cluster output:
9,8,0.0859780715517;3,2,0.121742078872;6,1,0.235191106376;10,-2,0.474667173546;-1,4,0.514123907524;7,-3,0.550835588735;
#11
Weird. Everything seems to be working correctly. . . I'm confused now as to what the problem could be. I'm going to be flying back to the States tomorrow. In a few days I'll have time to investigate more deeply where in the code the problem could be.
#12
Thanks for your help
#13
Toma,
I think this is why you're having trouble. It seems that Pycluster has changed its dependency from Python-Numeric to Python-Numpy. See this issue: http://drupal.org/node/285854#comment-1008043
Try installing Python-numpy and tell me if that fixes the issue.
#14
Hi. I'm having a similar issue (Python related, I'm sure, as this is a new box). Both Pycluster and numpy have been installed. I have a feeling it's a simple fix, but I am at a loss. When I do the first Pycluster test command, I get:
xorsyst@ubuntu:~/public_html$ python
Python 2.5.2 (r252:60911, Jul 31 2008, 17:28:52)
[GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from Pycluster import *
>>>
However, when I run the test cluster.py script from above, I get:
xorsyst@ubuntu:~/public_html$ python cluster.py
Traceback (most recent call last):
  File "cluster.py", line 39, in <module>
    y = array(p,'d')
NameError: name 'array' is not defined
No idea where to go from here. I'm curious, does memetracker still work with Pycluster 1.42 and Numpy? I am trying to find Pycluster 1.41 to see if it's just a problem with the new version.
#15
I found Pycluster 1.41, installed it and Numeric, and I still got the same error. However, if I change the array line to:
y = Numeric.array(p,'d')
cluster.py ran fine. So I changed the line in memetracker's cluster.py and the clustering of memes worked fine as well.
#16
Here's the actual solution to the problem. The new Pycluster requires that we import numpy in the memetracker script. Add
import numpy
to the top of your script and it should work. I need to get a new version of Memetracker out there with the change.
#17
There is no need to import numpy - because we are already using from Pycluster import * to import all packages
Changing
y = array(p,'d')
to
y = numpy.array(p,'d')
Working for me.
I am using
numpy-1.2.0
Pycluster-1.43
Thanks
Vinay
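Since the fixes in this thread differ by which array library is installed, one way to avoid editing the line per machine is to resolve the library explicitly instead of relying on what `from Pycluster import *` happens to export. This is only a sketch under that assumption; `first_importable` and `to_double_array` are hypothetical helpers, and the sample rows are dummy data, not Memetracker's.

```python
import importlib


def first_importable(names):
    """Return the first module in `names` that imports cleanly, or None."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


def to_double_array(rows, backend):
    """Build a double-precision array with whichever backend was found."""
    return backend.array(rows, 'd')


# Prefer numpy (newer Pycluster), fall back to Numeric (older Pycluster).
backend = first_importable(("numpy", "Numeric"))
if backend is None:
    print("Neither numpy nor Numeric is installed")
else:
    y = to_double_array([[1.0, 2.0], [3.0, 4.0]], backend)
    print("Using", backend.__name__)
    print(y)
```

With this pattern the `NameError` on `array` cannot occur, because the name is always looked up on a module that was actually imported.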
#18
Hi all.
I've been playing with Memetracker for the last few days, and I'm not seeing related content either.
I have run the cluster.py_.txt test, and all is good on the Python installation.
I changed the array line in cluster.py in the memetracker module.
I added 'import numpy' to cluster.py.
Still no related content.
I am using numpy-1.2.0 and Pycluster-1.43.
Thanks!
#19
Whoops, I gave the wrong import statement above. Try
from numpy import *
instead of
import numpy
in cluster.py.
#20
Thanks Kyle. I made this change, but still no luck.
Is there any other kind of test I can run to try to pinpoint the problem?
Thanks for your help!
#21
Check whether your file system in Drupal is working. The clustering won't work without the files directory (the data used in clustering is stored there for the Python script to read in).
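A quick way to test this from the command line is sketched below. The file name memetracker_data.txt is the one mentioned later in this thread; the directory in the example path is only an assumption about a default Drupal 6 layout, and `describe_data_file` is a hypothetical helper.

```python
import os


def describe_data_file(path):
    """Report the first problem found with the data file, or 'ok'."""
    directory = os.path.dirname(path) or "."
    if not os.path.isdir(directory):
        return "directory missing"
    if not os.access(directory, os.W_OK):
        return "directory not writable"
    if not os.path.isfile(path):
        return "data file missing"
    if os.path.getsize(path) == 0:
        return "data file empty"
    return "ok"


if __name__ == "__main__":
    # Assumed location; change it to your site's actual files directory.
    print(describe_data_file("sites/default/files/memetracker_data.txt"))
```

Run it as the same user the web server runs as, since the writability check depends on who is asking.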
#22
Hey Kyle :
Here is the File system info from 'admin/reports/status':
Writable (public download method)
#23
Well -- it's not that then. . . :) Do you mind giving me admin access to your site for a bit so I can look around? Just email me through my contact form.
#24
What should we be seeing if it is working properly? I just get a list of feed items at memetracker/1, with no real way to know what is related to what, if anything (can provide a screenshot if needed).
#25
I have also installed numpy, added 'from numpy import *' and 'y = numpy.array(p,'d')'. The example cluster.py_.txt file works for me with the desired output (after the changes), but there is still no clustering.
#26
dejbar -- sorry for not getting back to you earlier. Are you still having troubles with clustering?
#27
I am also running into the same issue. I see no clustering. Here's the output from the test script:
python cluster.py
Cluster output:
9,8,0.0859780715517;3,2,0.121742078872;6,1,0.235191106376;10,-2,0.474667173546;-1,4,0.514123907524;7,-3,0.550835588735;
How can I debug this issue?
#28
I believe I'm having a similar issue: all 'appears' to be working (following the Ubuntu Python instructions) but I can't tell what content is related to what other content. By lowering the pickiness, running cron and refreshing I can see a list of memes on the /memetracker/1 page, but I'm clueless as to how memetracker is displaying like-content.
Is there a step-by-step config/admin or user guide? How about a bare-bones fresh D6 install with the required modules installed that we could take a peek at to establish just what we should be seeing?
Thanks for all of the work to date done on this module...I know it'll be supremely handy!
#29
Same issue here. Ran the test (after Kyle's help here: http://drupal.org/node/333326 -- thanks!) and got the correct result after running cluster_test.py.
But clusters aren't appearing in memebrowser/1. I first ran cron about three or four times. Then I tried changing the pickiness setting down from 90 to 10, and increasing the number of memes from 10 to 15. There were five new memes added, but no change in the existing order, and still no clustering.
I then added about five more feeds (total now 15, with 159 items being brought in by Feedapi each time I run cron), which reshuffled the items on memebrowser/1 and added about seven or eight new memes. Still no clustering, and two memes (one in the #1 position and one in the #5 position) were covering the same thing -- i.e., they should have clustered, but didn't. I raised the pickiness back to 90 and ran cron yet again, but still no change.
What should I try next? I'd really love to get this working ....
Thanks
Ian
#30
Ian - can you post a screenshot of what you're seeing?
#31
Hello Kyle. Screenshot attached. The names running horizontally across the top of the screen are the sources I am using -- mostly mainstream business news sites. The five memes include a duplicate topic in the #1 and #5 slots -- the NYT and WSJ articles about Rattner stepping down.
I can include screenshots of anything else that might help you diagnose the problem, too ... just let me know.
Thanks
Ian
#32
I am also at the point where http://drupal.org/node/299632#comment-1812382 above is. The screenshot represents my output as well.
#33
The file that memetracker is writing to the files dir is attached.
#34
After figuring out that I need mysql5, I got that going on another server, and now memetracker_data.txt does indeed have some data.
But now I have: Fatal error: Call to a member function get_timestamp() on a non-object in /var/www/mysite/sites/all/modules/memetracker/machine_learning_api.inc on line 585
So I will dig into the machine learning API.