Re: Save Information to DB at Crawling
mich dobelman wrote:
> I am trying to write a crawler that grabs information and stores it
> in a database.
> The web site is structured as follows.
>
> REGION
> CATEGORY
> PROPERTY LISTING
>
> The site has about 50 regions. Each region has 20 categories or fewer,
> and a single category can hold as many as 2000 properties (20 properties
> are displayed per page). To get all the information, my crawler visits
> each property page and uses regular expressions to extract specific
> fields (price, BR, contact info, etc.).
>
> I am having trouble deciding when and where to save the data to the
> database. Note that the crawler is scheduled to visit the web site every
> day, and if a property's information has not changed since its last
> modification date, the crawler skips that property.
>
> I created the following tables to store this information:
>
> Region Table
> ID, Region Name
>
> Category Table
> ID(1~20), Category Name
>
> Property Table
> ID, Category, Name, Address, Price, Contact Info, Bed Rooms,
> Location(Lat), Location(Lon)
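
For readers following the thread, the "skip if unchanged" rule described in the question could be sketched roughly as below. This is only an illustration under stated assumptions, not a recommended design: Python's built-in sqlite3 module stands in for Oracle so the sketch runs anywhere, the column names and types are guesses based on the Property table above, the last_modified column is an addition the skip rule would need, and save_property and save_page are hypothetical helpers. The Region and Category lookup tables are omitted for brevity. It saves each property as it is parsed and commits once per listing page.

```python
import sqlite3

# Hypothetical schema modeled on the Property table in the question.
# Column names and types are assumptions; last_modified is added so the
# crawler can tell whether a listing changed since the previous run.
SCHEMA = """
CREATE TABLE property (
    id            INTEGER PRIMARY KEY,
    category_id   INTEGER NOT NULL,
    name          TEXT,
    address       TEXT,
    price         REAL,
    contact_info  TEXT,
    bedrooms      INTEGER,
    lat           REAL,
    lon           REAL,
    last_modified TEXT NOT NULL
);
"""

UPSERT = """
INSERT OR REPLACE INTO property
    (id, category_id, name, address, price, contact_info,
     bedrooms, lat, lon, last_modified)
VALUES
    (:id, :category_id, :name, :address, :price, :contact_info,
     :bedrooms, :lat, :lon, :last_modified)
"""


def save_property(conn, prop):
    """Store one parsed property; return 'inserted', 'updated', or 'skipped'."""
    row = conn.execute(
        "SELECT last_modified FROM property WHERE id = ?", (prop["id"],)
    ).fetchone()
    # Same modification date as the stored row: nothing to do.
    if row is not None and row[0] == prop["last_modified"]:
        return "skipped"
    conn.execute(UPSERT, prop)
    return "updated" if row is not None else "inserted"


def save_page(conn, props):
    """Save all properties from one listing page in a single transaction."""
    counts = {"inserted": 0, "updated": 0, "skipped": 0}
    with conn:  # commits on success, rolls back if an exception is raised
        for prop in props:
            counts[save_property(conn, prop)] += 1
    return counts


# Usage with made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

listing = {
    "id": 1, "category_id": 3, "name": "Sample listing",
    "address": "1 Example St", "price": 1200.0,
    "contact_info": "owner@example.com", "bedrooms": 2,
    "lat": 47.6, "lon": -122.3, "last_modified": "2006-08-15",
}

first = save_page(conn, [listing])    # new row is inserted
second = save_page(conn, [listing])   # unchanged, so it is skipped
changed = dict(listing, price=1150.0, last_modified="2006-08-16")
third = save_page(conn, [changed])    # newer date, so the row is updated
```

On Oracle itself, the insert-or-update step would more likely be written with a MERGE statement, and the commit granularity (per page, per category, or per run) is a design choice that depends on requirements not given in the post.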
There is insufficient information here to offer you much advice other than to suggest you take a class on Oracle PL/SQL programming and that, in the future, you always post information about tools, operating systems, hardware (when appropriate), and versions.
To design your schema would require a complete copy of the business rules and about $250/hr. Others will undoubtedly be less expensive. ;-)
--
Daniel A. Morgan
University of Washington
damorgan_at_x.washington.edu
(replace x with u to respond)
Puget Sound Oracle Users Group
www.psoug.org

Received on Wed Aug 16 2006 - 09:22:42 CDT