
suggestion: URL_REWRITE Type #6

Open
opened 2013-05-16 17:19:05 +01:00 by hnz101 · 2 comments
hnz101 commented 2013-05-16 17:19:05 +01:00 (Migrated from github.com)

I'm currently trying to write a good XPath extraction for a local newspaper website, but their markup contains a lot of unnecessary content, and there is no single div tag or similar that wraps just the article text.

My suggestion for cases like this would be a URL rewrite feature that fetches the print version instead of the normal article version.
With a simple regex rewrite of the URL, it could fetch a very slim and clean version of the article.
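
To illustrate the idea, here is a minimal sketch of such a regex-based URL rewrite. It is written in Python purely for illustration and is not the plugin's actual code or configuration format. The site `news.example.com`, its `/article/` and `/print/` paths, and the names `REWRITE_RULES` and `rewrite_url` are all hypothetical.

```python
import re

# Hypothetical rewrite rules: each entry is (compiled pattern, replacement).
# The example site and its URL layout are invented for illustration only.
REWRITE_RULES = [
    (re.compile(r"^(https?://news\.example\.com)/article/(\d+)$"), r"\1/print/\2"),
]


def rewrite_url(url, rules=REWRITE_RULES):
    """Return the URL rewritten by the first matching rule, or unchanged."""
    for pattern, replacement in rules:
        new_url, count = pattern.subn(replacement, url, count=1)
        if count:
            # A rule matched, so use the rewritten (print-version) URL.
            return new_url
    # No rule matched: fetch the original article URL as before.
    return url
```

The rewritten URL would then be fetched in place of the original one, and the XPath extraction would run on the much simpler print page. URLs that match no rule pass through untouched, so feeds without a print version keep working.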

mbirth commented 2013-06-20 11:05:35 +01:00 (Migrated from github.com)

This sounds good, I'll look into it when I find some time.

oscar-b commented 2013-07-08 07:57:46 +01:00 (Migrated from github.com)

You could probably use ff_FeedCleaner for this though?

Reference: mbirth/ttrss_plugin-af_feedmod#6