QUAC documentation
Select your topic of interest at left. If you want to start at the beginning, try the installation instructions.
Note
While we try to keep the documentation current and comprehensive, our main goal with QUAC is using it for science. Because of this, the documentation is known to have parts which are stale or erroneous, and there are significant omissions. Patches to improve it are very welcome.
About QUAC
QUAC (“Quantitative analysis of chatter” or any related acronym you like) is a package for acquiring and analyzing social internet content. Features:
- Reliable data collection and conversion of raw data into easy-to-parse, de-duplicated, and well-ordered formats. We support:
  - Tweets from the Twitter Streaming API.
  - Wikipedia hourly aggregate pageview logs.
  - Wikipedia edit history and related XML dumps.
- Estimation of the origin location of tweets with no geotag. (But see issue #15.)
- Careful preservation of Unicode throughout the processing pipeline.
- Various cleanup steps to deal with tweet quirks, including very rare ones (we’ve seen certain weirdnesses in only one of our 1.3+ billion tweets). That is, we deal with the special cases so you don’t have to.
- Parallel processing using various combinations of Make, joblib, and a simple map-reduce framework called QUACreduce, which is included.
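For readers unfamiliar with the map-reduce pattern mentioned in the last item, here is a minimal, generic sketch in plain Python. It is not QUACreduce and does not use its API; the function names (map_tweet, reduce_counts, map_reduce) and the sample records are hypothetical and exist only to illustrate the idea of mapping records to key/value pairs, grouping by key, and reducing each group.

```python
from itertools import groupby
from operator import itemgetter


def map_tweet(tweet):
    """Map step: emit one (token, 1) pair per whitespace-separated token."""
    for token in tweet["text"].lower().split():
        yield token, 1


def reduce_counts(token, counts):
    """Reduce step: sum the counts collected for one token."""
    return token, sum(counts)


def map_reduce(records, mapper, reducer):
    """Run mapper over records, group the pairs by key, reduce each group."""
    pairs = [pair for record in records for pair in mapper(record)]
    # Sorting by key stands in for the shuffle phase of a real framework.
    pairs.sort(key=itemgetter(0))
    return [
        reducer(key, [value for _, value in group])
        for key, group in groupby(pairs, key=itemgetter(0))
    ]


# Hypothetical sample data; real tweets carry many more fields.
tweets = [
    {"id": 1, "text": "flu season again"},
    {"id": 2, "text": "Flu shot today"},
]
result = map_reduce(tweets, map_tweet, reduce_counts)
```

In this sketch everything runs in one process. A real framework would distribute the map and reduce steps across workers, which is where the parallelism comes from.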
QUAC is copyright © 2012-2015 Los Alamos National Security, LLC, and others. It is open source under the Apache license and was formerly known as Twepi (“Twitter for epidemic analysis”).
Reporting bugs
Use our list of issues. To maximize the chances of your bug being understood and fixed, take a look at “Three parts to every good bug report” (scroll down).
That said, note that unlike many open source projects, we make a point of being friendly to bug reporters, even newbies. Therefore, please don’t hesitate to report a bug, even if you’re inexperienced with QUAC or feel unsure. In almost all cases you will tell us something useful, even if the issue turns out not to be a bug per se, and we will support your efforts in this regard.
If you find QUAC useful
Please send us a note at reidpr@lanl.gov if you use QUAC, even for small uses, and/or star the project on GitHub. This type of feedback is very important for continued justification of the project to our sponsors.
Note that for many uses of QUAC (especially research) you are ethically obligated to cite it. For guidelines on how to do this, see the Citing section of the documentation.
Science!
We use QUAC for scientific research. To promote reproducibility, which is one of the core values of science, we try to open-source the code that runs our related experiments as well as QUAC itself. This code, and further information about it, can be found in the directory experiments.
For more information
- Documentation is online at <http://reidpr.github.io/quac>. (Note: this may describe a different version of QUAC than the one you have.)
- Current documentation is rooted at doc/index.html. (You’ll probably need to build it first.)
- Most scripts have pretty help, which you can print using the --help option and/or read in the comments at the top of the script. Modules also usually have good docstrings.
Copyright
QUAC is copyright © 2012-2015 Los Alamos National Security, LLC and others.
This material was produced under U.S. Government contract DE-AC52-06NA25396 for Los Alamos National Laboratory (LANL), which is operated by Los Alamos National Security, LLC for the U.S. Department of Energy. The U.S. Government has rights to use, reproduce, and distribute this software. NEITHER THE GOVERNMENT NOR LOS ALAMOS NATIONAL SECURITY, LLC MAKES ANY WARRANTY, EXPRESS OR IMPLIED, OR ASSUMES ANY LIABILITY FOR THE USE OF THIS SOFTWARE. If software is modified to produce derivative works, such modified software should be clearly marked, so as not to confuse it with the version available from LANL.
This software is licensed under the Apache License, Version 2.0 (the “License”); you may not use it except in compliance with the License. A copy of the License is available in the file LICENSE or at http://www.apache.org/licenses/LICENSE-2.0.
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
This project contains code from Cyclopath, which is copyright (c) 2006-2012 Regents of the University of Minnesota and open source under the Apache license. Grep for “Cyclopath” to find the relevant files.
This software has been approved for open source release, LA-CC 12-038 and LA-CC 15-044.