Select /html/body/ #38

Open
opened 2014-02-24 23:13:52 +00:00 by lbell · 5 comments
lbell commented 2014-02-24 23:13:52 +00:00 (Migrated from github.com)

Any way to select an entire body of a page? I'm working on one that has no div or classes or much of anything except text wrapped in a body tag.

ravenx99 commented 2014-06-12 15:35:09 +01:00 (Migrated from github.com)

I'm in the same boat... I want to pull a comic image from a page that has no classes. This doesn't seem to work (Firebug says it's the xpath to the image tag).

"xpath": "html/body/table/tbody/tr[2]/td/table/tbody/tr/td[2]/table/tbody/tr/td/table/tbody/tr[2]/td/img"
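One possible cause, not confirmed anywhere in this thread: browsers insert `tbody` elements into the DOM when they parse a table, so a path copied from Firebug can contain `tbody` steps that do not exist in the HTML source the plugin fetches. The Python sketch below uses the standard library's `xml.etree.ElementTree` on an invented, well-formed snippet (the markup and the `comic.png` name are made up, and the plugin itself does not run Python) to show an inspector-style path finding nothing while the same path without `tbody` matches.

```python
import xml.etree.ElementTree as ET

# Hypothetical page source: note there is no <tbody> in the markup itself.
SOURCE = """
<html>
  <body>
    <table>
      <tr><td>header</td></tr>
      <tr><td><img src="comic.png"/></td></tr>
    </table>
  </body>
</html>
""".strip()

root = ET.fromstring(SOURCE)  # root is the <html> element

# Path as a browser inspector would report it (with the inserted tbody).
with_tbody = root.findall("./body/table/tbody/tr[2]/td/img")

# Same path with the tbody step dropped, matching the raw source.
without_tbody = root.findall("./body/table/tr[2]/td/img")

print(len(with_tbody))               # 0
print(without_tbody[0].get("src"))   # comic.png
```

If this is what is happening, dropping the `tbody` steps from the setting (or anchoring on a shorter expression such as one ending in `//img` with an attribute test) would be worth a try.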

m42e commented 2014-07-25 04:07:27 +01:00 (Migrated from github.com)

Can you provide a sample url?

troydunham commented 2014-08-11 18:11:29 +01:00 (Migrated from github.com)

I'm having a similar issue with a page without usable DIV classes. I have found a unique locator but can't seem to get it to pull body text.

Here is an example page: http://paddocktalk.com/news/html/story-259326.html
Here is the unique string: "IMG SRC=http://paddocktalk.com/news/html/images/smilies/icon_smile.gif"
I've tried all variations of XPATH that I can think of. My other pages with div classes are working perfectly.

m42e commented 2014-08-11 19:02:16 +01:00 (Migrated from github.com)

It seems to me that the page is not well-formed XML: some tags may be missing, or no encoding is specified, so some characters cannot be read. This leads to errors, and the XPath selection is never performed. You can try my version and use the split method: https://github.com/m42e/ttrss_plugin-af_feedmod
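To illustrate the point about malformed markup: the locator string quoted in the previous comment has an unquoted attribute value (`IMG SRC=` followed by a bare URL), which HTML tolerates but strict XML does not. The sketch below uses Python's `xml.etree.ElementTree` purely as a stand-in for a strict parser. It is an assumption that the plugin's parser trips on the same construct, and both snippets (including the `example.com` URL) are invented.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Unquoted attribute value and no closing tag, in the style of the smiley image.
unquoted = "<td><IMG SRC=http://example.com/icon_smile.gif></td>"

# Same content written as well-formed XML.
quoted = '<td><img src="http://example.com/icon_smile.gif"/></td>'

print(parses_as_xml(unquoted))  # False
print(parses_as_xml(quoted))    # True
```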

m42e commented 2014-08-11 22:22:24 +01:00 (Migrated from github.com)

OK, I dug in deeper.

@troydunham try: "xpath" : "td[@width='85%' and @valign='top' and @bgcolor='#FFFFFF']" and you may be near the treasure.....
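A sketch of what the suggested expression selects, using Python's standard `xml.etree.ElementTree` on an invented layout table (the plugin does not run Python, and the markup is hypothetical). ElementTree's limited XPath dialect has no `and` operator, so the three conditions are written as chained predicates, which pick the same cell here.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout table: only one cell carries all three attribute values.
PAGE = """
<html><body><table><tr>
  <td width="15%" valign="top" bgcolor="#FFFFFF">navigation</td>
  <td width="85%" valign="top" bgcolor="#FFFFFF">Story text <b>here</b>.</td>
</tr></table></body></html>
""".strip()

root = ET.fromstring(PAGE)

# Chained predicates stand in for:
#   td[@width='85%' and @valign='top' and @bgcolor='#FFFFFF']
XPATH = ".//td[@width='85%'][@valign='top'][@bgcolor='#FFFFFF']"

cells = root.findall(XPATH)

# Join every text node inside the matched cell to recover the body text.
body_text = "".join(cells[0].itertext())

print(len(cells))   # 1
print(body_text)    # Story text here.
```

Matching on presentational attributes like these is brittle, since any change to the page's layout breaks the selector, but it is often the only handle on pages without classes or ids.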

Reference: mbirth/ttrss_plugin-af_feedmod#38