Commit

Merge pull request #1 from bekicot/travis-fixes
Travis Fixes
bekicot authored Jul 21, 2017
2 parents dc399cf + 4dabc3f commit ceec82f
Showing 3 changed files with 5 additions and 4 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -1 +1,3 @@
 result.tar.gz
+results/
+htmls/
4 changes: 2 additions & 2 deletions .travis.yml
@@ -1,5 +1,5 @@
 language: ruby
-
+dist: trusty
 rvm:
 - 2.4.1

@@ -8,7 +8,7 @@ install:
 - bundle install

 script:
-- bundle exec ruby extract_contents_csv.rb -t -r
+- ruby extract_contents_csv.rb -t -r
 - tar -czf result.tar.gz results

 deploy:
3 changes: 1 addition & 2 deletions extract_contents_csv.rb
@@ -87,15 +87,14 @@ def rebuild_cache
   i = 0
   t_number = 0
   threads = []
-  mut = Mutex.new
   index_page.css('td a').each_slice(1000) do |links|
     threads << Thread.new do
       links.each do |link|
         tries = 3
         LOGGER.info("fetching #{i}..#{i + 100}") if (i+=1) % 100 == 0
         begin
           url = URI(link.attr('href'))
-          File.write("htmls/#{url.to_s.split('/').last}", Net::HTTP.get(url))
+          system "curl -so #{"htmls/#{url.to_s.split('/').last}"} #{url}"
         rescue Exception => e
           retry unless (tries -= 1 ).zero?
           LOGGER.error(e.message + " #{url.to_s}")
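
Note on the change above: `Kernel#system` reports a failed command through its return value (`false` or `nil`) rather than by raising, so the surrounding `rescue`/`retry` may never fire when curl fails. A minimal sketch of one way to keep the retry behaviour follows. The helper name `fetch_with_retries` and its `runner` parameter are hypothetical and not part of this commit, and curl's `-f` flag is added here as an assumption so that HTTP errors produce a non-zero exit status.

```ruby
# Hypothetical helper, not part of the commit: download `url` to `dest`
# with curl, retrying while curl reports failure.
# Returns the number of attempts used, or nil if every attempt failed.
# `runner` is injectable so the retry logic can be tried without a network.
def fetch_with_retries(url, dest, tries: 3, runner: ->(argv) { system(*argv) })
  # Passing the arguments separately bypasses the shell, so characters
  # in the URL are never interpreted as shell syntax.
  argv = ['curl', '-s', '-f', '-o', dest, url.to_s]
  tries.times do |attempt|
    # system returns true only when the command exits with status 0.
    return attempt + 1 if runner.call(argv)
  end
  nil
end
```

Inside the loop from the diff, this could be called as `fetch_with_retries(url, "htmls/#{url.to_s.split('/').last}")`, logging an error when it returns `nil`.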