Worf

Mapping HTML trees with VBA

Worf submitted a new Excel article:

Mapping HTML trees with VBA - This article shows how to list parents and children that form the structure of Web pages.

When performing Web scraping, sometimes it is necessary to analyse how the page was constructed. Here are the main features of this post:


  • Lists the XPath of every parent-child pair in a local HTML test file.
  • Reports how many levels the page has and the tag of each element.
  • When tested against an actual Web page (www.mrexcel.com), it found a total of 339 elements arranged on 12 levels.
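
The full article code is not reproduced in this post, so the following is only a minimal sketch of how such a mapping could be done, not the article's own implementation. It assumes a late-bound "htmlfile" (MSHTML) document, a short inline HTML string standing in for the local test file, and a hypothetical output layout on the active sheet (level, tag, parent XPath, child XPath). The names MapHtmlTree and Walk are illustrative.

```vba
Option Explicit

' Hypothetical output layout on the active sheet:
' A = level, B = tag, C = parent XPath, D = child XPath
Private mRow As Long        ' last row written
Private mMaxLevel As Long   ' deepest level seen so far

Sub MapHtmlTree()
    Dim doc As Object, html As String

    ' Assumption: an inline string stands in for the local HTML test file
    html = "<html><head><title>t</title></head>" & _
           "<body><div><p>a</p><p>b</p></div></body></html>"

    Set doc = CreateObject("htmlfile")   ' late-bound MSHTML document
    doc.write html
    doc.Close

    mRow = 1: mMaxLevel = 0
    Cells(mRow, 1).Resize(1, 4).Value = _
        Array("Level", "Tag", "Parent XPath", "Child XPath")

    ' Start at the root element; it has no parent path
    Walk doc.documentElement, "", "/html[1]", 1

    Debug.Print "Elements: " & (mRow - 1) & ", levels: " & mMaxLevel
End Sub

' Writes one row for the node, then recurses into its element children
Private Sub Walk(ByVal node As Object, ByVal parentPath As String, _
                 ByVal nodePath As String, ByVal level As Long)
    Dim kids As Object, child As Object, counts As Object
    Dim i As Long, nm As String

    mRow = mRow + 1
    Cells(mRow, 1).Resize(1, 4).Value = _
        Array(level, LCase$(node.tagName), parentPath, nodePath)
    If level > mMaxLevel Then mMaxLevel = level

    ' Count same-tag siblings so each XPath step gets a 1-based index
    Set counts = CreateObject("Scripting.Dictionary")
    Set kids = node.Children
    For i = 0 To kids.Length - 1
        Set child = kids.Item(i)
        nm = LCase$(child.tagName)
        counts(nm) = counts(nm) + 1
        Walk child, nodePath, _
             nodePath & "/" & nm & "[" & counts(nm) & "]", level + 1
    Next i
End Sub
```

Running MapHtmlTree from a standard module should fill one row per element and print the element and level totals to the Immediate window. Recursion keeps the level bookkeeping simple; for very deep pages an explicit stack may be preferable.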

View attachment 69228...

Read more about this Excel article...