Beyond Seven Review No. 52

Posted on Wed 09 September 2020 in Beyond Seven Review

Data analysis and Bayesian statistics ⚽

Free/libre and open-source software 🌺

Information and Geisteswissenschaften 🏺


Beyond Seven Review appears regularly at https://www.beyondseven.org. Subscribe.


Continue reading

Beyond Seven Review No. 51

Posted on Wed 26 August 2020 in Beyond Seven Review

(Quantitative) cultural studies 🚣

  • txtlab Multilingual Novels
    "This directory contains 450 novels that appeared between 1770 and 1930 in German, French and English. It is designed for use in teaching and research."
  • Towards Controllable Story Generation
    "We present a general framework of analyzing existing story corpora to generate controllable and creative new stories. The proposed framework needs little manual annotation to achieve controllable story generation. It creates a new interface for humans to interact with computers to generate personalized stories. We apply the framework to build recurrent neural network (RNN)-based generation models to control story ending valence and storyline. Experiments show that our methods successfully achieve the control and enhance the coherence of stories through introducing storylines. With additional control factors, the generation model gets lower perplexity, and yields more coherent stories that are faithful to the control factors according to human evaluation."

Data analysis and Bayesian statistics ⚽

  • Expert Knowledge Elicitation: Subjective but Scientific
    """Expert opinion and judgment enter into the practice of statistical inference and decision-making in numerous ways. Indeed, there is essentially no aspect of scientific investigation in which judgment is not required. Judgment is necessarily subjective, but should be made as carefully, as objectively, and as scientifically as possible. Elicitation of expert knowledge concerning an uncertain quantity expresses that knowledge in the form of a (subjective) probability distribution for the quantity. Such distributions play an important role in statistical inference (for example as prior distributions in a Bayesian analysis) and in evidence-based decision-making (for example as expressions of uncertainty regarding inputs to a decision model). This article sets out a number of practices through which elicitation can be made as rigorous and scientific as possible. One such practice is to follow a recognized protocol that is designed to address and minimize the cognitive biases that experts are …

Continue reading

Beyond Seven Review No. 50

Posted on Wed 12 August 2020 in Beyond Seven Review

(Quantitative) cultural studies 🚣

  • The man who taught BHO to cook
    A data-rich close reading of a book by Obama and a book by Ayers. (cf. Has Critique Run out of Steam) """Obama's floors slant. Ayers' floors slope. Both use skillets. According to the American Heritage Dictionary, the word "skillet" seems "to have been confined to the Midland section of the country." Elsewhere, "frying pan" or "fry pan" is more common. Ayers grew up in Illinois. In "Dreams," which is about 130,000 words in length, the word kitchen appears 29 times. In "Fugitive Days," it appears 11 times in about 100,000 words. As a control, I used my 2000 novel, "2006: The Chautauqua Rising." I have very little interest in cooking. It shows. Although the novel has a 26 year-old male protagonist and any number of domestic settings, the word "kitchen" appears once in about 100,000 words."""
  • What do the experts know? Calibration, precision, and the wisdom of crowds among forensic handwriting experts
    Danielle J. Navarro

Book and publishing history 🚟

Data analysis and Bayesian statistics ⚽



Continue reading

Beyond Seven Review No. 49

Posted on Wed 29 July 2020 in Beyond Seven Review

Data analysis and Bayesian statistics ⚽

  • A neat reminder of just how unrepresentative an average can be:
    Superb visualization of recent economic history in the US (1980-2014). "A neat reminder of just how unrepresentative an average can be: Average US income growth 1980-2014 was 1.4% per year. But almost the entire income distribution - 1st-87th percentile - had income growth below this average. Graph from @PikettyLeMonde , Saez & @gabriel_zucman"
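
    The arithmetic behind that observation can be sketched in a few lines. This is a toy illustration only: the three groups, their starting incomes, and their growth rates are made-up assumptions chosen to reproduce the shape of the claim, not the Piketty, Saez and Zucman data. Average growth here means growth of total income, so groups with high incomes weigh more heavily.

```python
# Toy example: (share of population in percent, starting income, one-year growth).
# All numbers are hypothetical and chosen only for illustration.
groups = [
    (87, 30.0, 0.010),    # bottom 87 percentiles
    (12, 150.0, 0.020),   # next 12 percentiles
    (1, 1500.0, 0.030),   # top percentile
]

# Total income before and after one year of growth.
total_before = sum(pop * income for pop, income, _ in groups)
total_after = sum(pop * income * (1 + growth) for pop, income, growth in groups)

# Average growth is the growth of total income (income-weighted).
average_growth = total_after / total_before - 1

# Share of the population whose own growth is below that average.
population = sum(pop for pop, _, _ in groups)
share_below = sum(pop for pop, _, growth in groups if growth < average_growth) / population

print(f"average income growth: {average_growth:.2%}")
print(f"population with below-average growth: {share_below:.0%}")
```

    Because a small high-income group with fast growth dominates the total, the average sits above the growth experienced by most of the population.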

Free/libre and open-source software 🌺

Information and Geisteswissenschaften 🏺



Continue reading

Beyond Seven Review No. 48

Posted on Wed 15 July 2020 in Beyond Seven Review

(Quantitative) literary history and sociology of literature 🦉

  • Mid-Range Reading: Manifesto Edition
    Panel at DH2018. Wonderful references section. Participants: Grant Wythoff (grant.wythoff@gmail.com), Pennsylvania State University, United States of America; Alison Booth (ab6j@virginia.edu), University of Virginia, United States of America; Sarah Allison (sallison@loyno.edu), Loyola University New Orleans, United States of America; Daniel Shore (Daniel.Shore@georgetown.edu), Georgetown University, United States of America

Counterantidisintermediation 🌔

  • Thwarting Tech Giants
    Block traffic to tech giants at the router level. "Streamlined version of the tech in the Goodbye Big Five Series gizmodo.com/c/goodbye-big-five"

Data analysis and Bayesian statistics ⚽

Information and Geisteswissenschaften 🏺



Continue reading

Beyond Seven Review No. 47

Posted on Wed 01 July 2020 in Beyond Seven Review

Book and publishing history 🚟

  • How Digitization Has Created a Golden Age of Music, Movies, Books, and Television
    JEP piece by Waldfogel. Abstract: "Digitization is disrupting a number of copyright-protected media industries, including books, music, radio, television, and movies. Once information is transformed into digital form, it can be copied and distributed at near-zero marginal costs. This change has facilitated piracy in some industries, which in turn has made it difficult for commercial sellers to continue generating the same levels of revenue for bringing products to market in the traditional ways. Yet despite the sharp revenue reductions for recorded music, as well as threats to revenue in some other traditional media industries, other aspects of digitization have had the offsetting effects of reducing the costs of bringing new products to market in music, movies, books, and television. On balance, digitization has increased the number of new products that are created and made available to consumers. Moreover, given the unpredictable nature of product quality, growth in new products has given rise to substantial increases in the quality of the best products. Although there were concerns that consumer welfare from media products would fall, the opposite scenario has emerged—a golden age for consumers who wish to consume media products. ..."

Data analysis and Bayesian statistics ⚽

Information and Geisteswissenschaften 🏺



Continue reading

Beyond Seven Review No. 46

Posted on Wed 17 June 2020 in Beyond Seven Review

Book and publishing history 🚟

  • How to self-publish a book: A handy list of resources
    Comments about Bowker's monopoly on ISBNs in the USA.
  • The Elasticity of Demand With Respect to Product Failures; or Why the Market for Quack Medicines Flourished for More Than 150 Years
    Why demand for low-quality products remains (very) high. Abstract: "Between 1810 and 1939, real per capita spending on patent medicines grew by a factor of 114; real per capita GDP by a factor of 5. The long-term growth and survival of this industry is puzzling when juxtaposed with standard historical accounts, which typically portray patent medicines as quack medicines. This paper argues that patent medicines were distinguished from other products by an unusually low elasticity of demand with respect to product failure. While consumers in other markets stopped searching for a viable product after a few failed attempts, consumers of patent medicines kept trying different products, irrespective of the number of failed medicines they observed. The market expanded as the stock of people buying potential cures accumulated over time. Because no one was ever cured and consumers possessed a highly inelastic demand with respect to product failures, demand was unrelenting. In short, patent medicines flourished not despite their dubious medicinal qualities, but because of them. There is also evidence that genuine medical advances, such as the rise of the germ theory of disease and new therapeutic interventions, helped expand the market for quack medicines."
  • What we talk about when we talk
    Interesting take on the Dan Mallory fraud.

Free/libre and open-source software 🌺



Continue reading

Beyond Seven Review No. 45

Posted on Wed 03 June 2020 in Beyond Seven Review

(Quantitative) literary history and sociology of literature 🦉

Data analysis and Bayesian statistics ⚽

  • The natural selection of bad science
    Abstract: Poor research design and data analysis encourage false-positive findings. Such poor methods persist despite perennial calls for improvement, suggesting that they result from something more than just misunderstanding. The persistence of poor methods results partly from incentives that favour them, leading to the natural selection of bad science. This dynamic requires no conscious strategizing—no deliberate cheating nor loafing—by scientists, only that publication is a principal factor for career advancement. Some normative methods of analysis have almost certainly been selected to further publication instead of discovery. In order to improve the culture of science, a shift must be made away from correcting misunderstandings and towards rewarding understanding. We support this argument with empirical evidence and computational modelling. We first present a 60-year meta-analysis of statistical power in the behavioural sciences and show that power has not improved despite repeated demonstrations of the necessity of increasing power. To demonstrate the logical consequences of structural incentives, we then present a dynamic model of scientific communities in which competing laboratories investigate novel or previously published hypotheses using culturally transmitted research methods. As in the real world, successful labs produce more ‘progeny,’ such that their methods are more often copied and their students are more likely to start labs of their own. Selection for high output leads to poorer methods and increasingly high false discovery rates. We additionally show that replication slows but does not stop the process of methodological deterioration. Improving the quality of research requires change at the institutional level.

Free/libre and open-source software 🌺


Continue reading

Beyond Seven Review No. 44

Posted on Wed 20 May 2020 in Beyond Seven Review

(Quantitative) cultural studies 🚣

Data analysis and Bayesian statistics ⚽

Information and Geisteswissenschaften 🏺



Continue reading

Beyond Seven Review No. 43

Posted on Wed 06 May 2020 in Beyond Seven Review

Book and publishing history 🚟

Data analysis and Bayesian statistics ⚽

  • Abandon Statistical Significance
    Abstract: "We discuss problems the null hypothesis significance testing (NHST) paradigm poses for replication and more broadly in the biomedical and social sciences as well as how these problems remain unresolved by proposals involving modified p-value thresholds, confidence intervals, and Bayes factors. We then discuss our own proposal, which is to abandon statistical significance. We recommend dropping the NHST paradigm—and the p-value thresholds intrinsic to it—as the default statistical paradigm for research, publication, and discovery in the biomedical and social sciences. Specifically, we propose that the p-value be demoted from its threshold screening role and instead, treated continuously, be considered along with currently subordinate factors (e.g., related prior evidence, plausibility of mechanism, study design and data quality, real world costs and benefits, novelty of finding, and other factors that vary by research domain) as just one among many pieces of evidence. We have no desire to “ban” p-values or other purely statistical measures. Rather, we believe that such measures should not be thresholded and that, thresholded or not, they should not take priority over the currently subordinate factors. We also argue that it seldom makes sense to calibrate evidence as a function of p-values or other purely statistical measures. We offer recommendations for how our proposal can be implemented in the scientific publication process as well as in statistical decision making more broadly."

Free/libre and open-source software 🌺

Information and …


Continue reading