Windows 7 Reconstituting HTML objects with text

gpkoz

New Member
Joined
Jun 17, 2012
Messages
2
I downloaded a number of text files in HTML using the "Save as" command to files in a directory. Unfortunately, once I looked at them they were all separated into smaller files, mostly of 3 KB or so. CSS and other objects landed in the same file but now I can't read the whole file in pieces. How can I reassemble them so it will run? Sometimes I find the .exe file but it won't work because they are all unhooked from each other. My PDFs remain intact and I now only download PDF files. I would appreciate help if someone could tell me if there's any program I could use to put these objects into and rehook them to each other. I have about 3,000 files that need processing. Sorry for not using the right words to describe this but I never knew Windows 7 did such a thing. gpkoz@juno.com
 

Solution
It seems like your HTML files were saved in such a way that they are now fragmented into smaller pieces, making it difficult to read or run them successfully. To reassemble these files and make them usable again, you can try using a web scraping tool that can combine these fragmented files back into their original format. Here's a general approach you can take to reassemble these fragmented HTML files:
  1. Automate the Reassembly Process: You can use tools like Beautiful Soup in Python or other web scraping tools to automate the process of combining these smaller HTML files back into complete files. This approach will require some programming knowledge or assistance from someone with coding skills.
  2. Batch Processing: Since you have a large number of files (3,000), you may want to create a script that can process them in batches. This script can iterate through the files, extract the necessary content, and merge them back together.
  3. Identify and Merge Dependencies: In addition to reassembling the HTML files, ensure that any external dependencies like CSS files, images, and scripts are properly linked back to the main HTML file.
  4. Testing and Verification: After reassembling the files, test them to ensure they function as expected and that all the content is correctly linked.
If you're not familiar with programming or prefer a more user-friendly solution, you can explore software tools designed for web scraping and HTML manipulation. Some tools that might be useful include:
    • Data Scraping Tools: Tools like ParseHub, Octoparse, or OutWit Hub can help extract data from web pages and reassemble HTML content.
    • HTML Editors: Applications like Adobe Dreamweaver or Sublime Text provide robust features for editing and managing HTML files.
Before proceeding, it's essential to back up your files to prevent any accidental loss or corruption during the reassembly process. Since you mentioned you now only download PDF files, this issue with fragmented HTML files should be avoided in the future.
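As a starting point for the batch processing and dependency steps above, here is a minimal Python sketch. It only reads your files and changes nothing, so it is safe to run before any real reassembly. It walks a folder, finds every HTML file, and lists the CSS, script, and image files each page refers to that are no longer where the page expects them. It uses Python's built-in `html.parser` rather than Beautiful Soup so that nothing extra has to be installed. The names `ResourceCollector`, `local_references`, and `find_missing_dependencies` are made up for this example, and it assumes the pages point at their pieces with relative paths, which may not match how your files were actually saved.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urlparse, unquote


class ResourceCollector(HTMLParser):
    """Collect the href/src values of tags that pull in external files."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and attrs.get("href"):
            self.resources.append(attrs["href"])
        elif tag in ("script", "img") and attrs.get("src"):
            self.resources.append(attrs["src"])


def local_references(html_path):
    """Return the local (non-network) file references found in one HTML file."""
    collector = ResourceCollector()
    text = Path(html_path).read_text(encoding="utf-8", errors="replace")
    collector.feed(text)
    refs = []
    for url in collector.resources:
        parsed = urlparse(url)
        # Skip network addresses (http:, https:, //host/...) and data: URIs.
        if parsed.scheme or parsed.netloc or not parsed.path:
            continue
        refs.append(unquote(parsed.path))
    return refs


def find_missing_dependencies(root):
    """Map each HTML file under root to the referenced files that are missing."""
    report = {}
    for html_path in sorted(Path(root).rglob("*.htm*")):
        if not html_path.is_file():
            continue
        missing = [
            ref for ref in local_references(html_path)
            if not (html_path.parent / ref).exists()
        ]
        if missing:
            report[str(html_path)] = missing
    return report
```

To try it, call `find_missing_dependencies` with the folder that holds your saved pages (for example a hypothetical `C:\Users\you\Downloads\pages`) and print the result. Pages that do not appear in the report already have all their pieces in place; the rest show which files need to be moved back next to them.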
 
