rOpenSci | Blog

All posts (Page 36 of 87)

The Antarctic/Southern Ocean rOpenSci community

Antarctic/Southern Ocean science and rOpenSci

Collaboration and reproducibility are fundamental to Antarctic and Southern Ocean science, and the value of data to Antarctic science has long been promoted. The Antarctic Treaty (which came into force in 1961) included the provision that scientific observations and results from Antarctica should be openly shared. The high cost and difficulty of acquisition means that data tend to be re-used for different studies once collected....

Tesseract 4 is here! State of the art OCR in R!

Last week Google and friends released the new major version of their OCR system: Tesseract 4. This release builds upon 2+ years of hard work and has completely overhauled the internal OCR engine. From the tesseract wiki: Tesseract 4.0 includes a new neural network-based recognition engine that delivers significantly higher accuracy (on document images) than the previous versions, in return for a significant increase in required compute power. On complex languages however, it may actually be faster than base Tesseract....

Sharing the Recipe for rOpenSci's Unconf Ice Breaker

While many people groan at the thought of participating in a group ice breaker activity, we’ve gotten consistent feedback from people who have been to recent rOpenSci unconferences: “Best ice breaker ever!” We’ve had lots of requests for a detailed description of how we do it. This post shares our recipe, including a script you can adapt, a reflection on its success, examples of how others have used it, and some tips to remember....

Community Call - Working with images in R

rOpenSci’s software engineer / postdoc Jeroen Ooms will explain what images are, under the hood, and showcase several rOpenSci packages that form a modern toolkit for working with images in R, including opencv, av, tesseract, magick and pdftools.

🕘 Thursday, November 15, 2018, 10-11AM PST; 7-8PM CET

☎️ Find all details for joining the call on our Community Calls page. Everyone is welcome. No RSVP needed.

Agenda

Welcome (Stefanie Butland, rOpenSci Community Manager, 5 min)
Working with images in R (Jeroen Ooms, 35 min)
Q & A (20 min)

Abstract

Images in various forms are used for numerous applications across scientific disciplines....

pubchunks: extract parts of scholarly XML articles

pubchunks is a package that grew out of the fulltext package. fulltext provides a single interface to many sources of full-text scholarly articles. The fulltext user flow includes an extraction step in which fulltext::chunks() pulls parts of articles out of XML-format article files. To make fulltext more maintainable and focused on simply fetching articles, and because pulling out bits of structured XML files is a more general problem, we broke out pubchunks into a separate package....
