The Wrong “Scientific” for Education

The release of National Assessment of Educational Progress (NAEP) 2019 scores in math and reading, announced as an “emergency” and “devastating,” has thrown gasoline on the rhetorical fire already sweeping across media—a call for “scientific” research to save public education in the U.S.

While the media and the public seem, both historically and currently, convinced by the rhetoric of “scientific,” there is a significant irony in the lack of scientific evidence backing claims about the causes of NAEP scores; for example, some have rushed to argue that intensive phonics instruction and grade-retention legislation caused Mississippi’s NAEP reading gains, while many others have used 2019 NAEP scores to declare the entire accountability era a failure.

Yet none of these claims is backed by the scientific evidence needed to support it. There simply has not been the time or the effort to construct scientific studies (experimental or quasi-experimental) to identify causal factors in NAEP score changes.

Another problem with the rhetoric of “scientific” is that coinciding with that advocacy are some very disturbing contradictory realities.

And let’s not forget that for at least two decades, “scientific” has been central to No Child Left Behind and the Common Core—both of which were championed as mechanisms for finally bringing education into a new era of evidence-based practices.

We must wonder: If “scientific” is the answer to our educational failures, what has happened over the past twenty years of “scientific” being legislated into education, now that everyone is shouting that the sky is falling because 2019 NAEP scores are down from 2017 and have remained relatively flat since the early 1990s (30 of the 40 years spanning the accountability era)?

First, there is the problem of definition. “Scientific” is shorthand for a very narrow type of quantitative research: the experimental and quasi-experimental designs that are the gold standard of pharmaceutical and medical research.

To meet the standard of “scientific,” then, research in education would have to include random-sample populations of students and a control group in order to draw causal relationships and make generalizations. This process is incredibly expensive in terms of funding and time.

As I noted above, no one has had the time to conduct “scientific” research on 2019 NAEP data, so making causal claims of any kind about why NAEP scores dropped is necessarily not “scientific.”

But there is a second, and larger, problem with calling for “scientific” research in education. This narrow form of “scientific” is simply wrong for education.

Experimental and quasi-experimental research seeks to identify causal generalizations. In other words, if we divide all students into a bell-shaped curve with five segments, the meaty center segment would be where the generalization from a study has the greatest effectiveness. The adjacent two outer segments would show some decreasing degrees of effectiveness, leaving the two extreme segments at the far ends of the curve likely showing little or no effectiveness (these students, however, could have learned under instruction not shown as generally effective).
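To make that segmentation concrete, here is a minimal sketch in Python, assuming (purely for illustration) that the five segments are cut at ±1 and ±2 standard deviations of a normal distribution:

```python
from scipy.stats import norm

# Illustrative assumption: five segments cut at -2, -1, +1, and +2
# standard deviations of a normal ("bell-shaped") distribution.
cuts = [float("-inf"), -2, -1, 1, 2, float("inf")]
labels = ["far low", "low", "center", "high", "far high"]

for label, lo, hi in zip(labels, cuts, cuts[1:]):
    share = norm.cdf(hi) - norm.cdf(lo)
    print(f"{label:>8}: {share:.1%}")

# The "center" segment holds roughly 68% of students; the two extreme
# segments hold about 2.3% each -- the students for whom a causal
# generalization may not apply at all.
```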

Yet, in a real classroom, teachers are not serving a random sampling of students, and there are no controls to assure that some factors are not causing different outcomes for students even when the instructional practice has been shown by scientific research to be effective.

No matter the science behind instruction, sick, hungry, or bullied students will not be able to learn.

The truth is, in education, scientific studies are nearly impossible to conduct, are often overly burdensome in terms of expense and time, and are ultimately not adequate for the needs of real teachers and students in real classrooms—where teaching and learning are messy, idiosyncratic, and impacted by dozens of factors beyond the control of teachers or students.

Frankly, nothing works for all students, and a generalization can be of no use to a particular student with an outlier need.

While we are over-reacting to 2019 NAEP reading scores, we have failed to recognize that there has never been a period in the U.S. when reading achievement was adequate; over that history teachers have implemented hundreds of different instructional strategies, reading programs, standards, and high-stakes tests—and we always find the outcomes unsatisfying.

If there is any causal relationship between how we teach and how students learn, it is a cumbersome matrix of factors that has been mostly unexamined, especially by “scientific” methods.

And often, history is a better avenue than science.

The twenty-first century has not been the only era calling for “scientific” in educational practice.

The John Dewey progressivism of the early twentieth century was also characterized by a call for scientific practice. Lou LaBrant, who taught from 1906 until 1971 and rose to president of the National Council of Teachers of English in the 1950s, was a lifelong practitioner of Deweyan progressivism.

LaBrant called repeatedly for closing the “gap” between research and practice, but she also balked at reading and writing programs—singular approaches to teaching all students literacy.

While progressive education and Dewey are often demonized and blamed for educational failure by the mid-twentieth century, the truth is that progressivism has never been widely embraced in the U.S.

Today, however, we should be skeptical of the narrow and flawed call for “scientific” and embrace instead the progressive view of “scientific.”

For Dewey, the teacher must simultaneously teach and conduct research—what eventually would be called action research.

To teach, for progressives, is to constantly gather evidence of learning from students in order to drive instruction; in this context, science means that each student receives the instruction they demonstrate a need for and that produces some outcomes of effectiveness.

In an elementary reading class, some students may be working in read-aloud groups while others receive direct phonics instruction, and still others sit in book clubs reading picture books by choice. None of them, however, would be doing test-prep worksheets or computer-based programs.

The current urge toward “scientific” seems to embrace the false assumption that with the right body of research we can identify the single approach for all students to succeed.

Human learning, however, is as varied as humans themselves.

This brings us to the current “science of reading” narrative that calls for all students to receive intensive systematic phonics, purportedly because scientific research calls for such. The “science of reading” narrative also rejects and demonizes “balanced literacy” as not “scientific.”

We arrive then back at the problem of definition.

The “science of reading” advocacy is trapped in too narrow a definition of “scientific” that is fundamentally wrong for educating every student. Ironically, again, balanced literacy is a philosophy of literacy (not a program) that implements Deweyan progressive “scientific”; each student receives the reading instruction they need based on the evidence of learning the teacher gathers from previous instruction, evidence used to guide future instruction.

Intensive phonics for all begins with a fixed mandate regardless of student ability or need; balanced literacy starts with the evidence of the student.

If we are going to argue for “scientific” education in the U.S., we would be wise to change the definition, expand the evidence, and tailor our instruction to the needs of our students and not the propagandizing of a few zealots.

For two decades at least, the U.S. has been chasing the wrong “scientific” as a distraction from the real guiding principle, efficiency. Reducing math and reading to discrete skills and testing those skills as evidence for complex human behaviors are as misleading as arguing that “scientific” research will save education.

Teachers as daily, patient researchers charged with teaching each student as that student needs—that is the better “scientific” even as it is much messier and less predictable.

On Poetry and Prose: Defining the Undefinable

As a professor of first-year writing, I spend a good deal of time helping students unpack what they have learned about the essay as a form and about writing in order to set much of that aside and embrace more nuanced and authentic awareness about both.

Teaching writing is also necessarily entangled with teaching reading. In my young adult literature course, then, I often ask students, undergraduate and graduate (many practicing teachers), to do similar unpacking about their assumptions concerning writing and reading.

I have noted before that my first-year students often mangle what I would consider to be very basic labels for writing forms and genres—calling a short prose piece a poem and identifying a play as a novel because they read both in book form.

Because of the ways students have been taught writing to comply with accountability and high-stakes testing of writing, they also confuse modes (narration, description, exposition, and persuasion) for genres or types of essays.

These overly simplistic or misguided ideas extend to distinguishing between fiction and non-fiction as well as prose and poetry.

I am always adding to my toolkit, then, lessons that ask students to investigate and interrogate genre, form, and mode, instilling a sense that literacy remains something undefinable that we nonetheless try to define so that we feel we have greater control over it.

This post details a lesson about recognizing all literacy as a journey, and embracing defining the undefinable.

The seeds of the lesson, in fact, lie in my own stumbling journey with literacy. The first time I read “Gate A-4” by Naomi Shihab Nye, I assumed the piece was a personal essay.

I think I may have shared the passage with students and even referred to it as such. At some point after that, I ran across the piece being referred to as fiction, a very brief short story.

This week, as I was planning a lesson on how we distinguish poetry from short fiction, I considered using “Gate A-4” along with four poems by women poets—Adrienne Rich’s “Aunt Jennifer’s Tigers,” Maggie Smith’s “Good Bones,” Emily Dickinson’s “Wild nights – Wild nights!,” and Margaret Atwood’s “Siren Song.”

As I searched online for “Gate A-4,” I noticed that the piece was routinely identified as a poem. However, when I did a “Look Inside” search of Naomi Shihab Nye’s Honeybee: Poems & Short Prose, I discovered that the piece is clearly prose, one of what the book description identifies as “eighty-two poems and paragraphs.”

I also discovered a wonderful video of Nye reading the passage aloud.

This became the opening for the lesson, which began with asking students to watch the read aloud without a text in front of them. After viewing, I asked them to identify the text form—what is this thing she is reading?

The students were cautious, even hesitant to answer, exposing, I think, the many elements of a text that advanced readers use to make a significant number of decisions in a very brief moment. We know poetry from prose simply from seeing the text, even before reading.

As we struggled, I handed out a copy of “Gate A-4” and explained that it is prose (although some had guessed poem). I also pulled up the Amazon link and showed them the piece in the original book.

Next, I placed them in small groups with the four poems noted above, asking them to use one or as many of the poems as they wanted to create a quick lesson on what makes a poem, a poem.

The first group decided to use all four poems, and began by noting that students would identify in “Aunt Jennifer’s Tigers” what most people associate with poetry—rhyme and stanzas.

They also recognized that those assumptions were challenged when turning to “Good Bones,” since, as they explained, this poem doesn’t rhyme and has no stanzas (which we later clarified to note it is simply one stanza, constructed of lines).

Since “Aunt Jennifer’s Tigers” and “Wild nights – Wild nights!” tend to conform to narrow and traditional characteristics associated with poetry, while “Good Bones” and “Siren Song” look poetic but sit outside those characteristics, we began to brainstorm broader concepts; for example, we explored that all the poems have repetition (noting that rhyme is sound repetition) and concluded that poetry is often driven by purposeful line form and stanzas.

Possibly the key moment of this discussion was when the second group added that the best we can say is that a poem is a poem because the writer identifies it as such. We have come to a similar conclusion about the genre of young adult literature.

Another important part of this exploration came from a student who explained that he had always been bothered by trying to write poetry in high school, specifically the concept of line breaks. The how of breaking lines eluded him.

Here is something I always emphasize when teaching someone to write poetry—the art and craft of line breaks.

Broadly, we can help students better understand form and genre by keeping them focused on prose as a work driven by purposeful sentence and paragraph formation and poetry as a work driven by purposeful line and stanza formation (recognizing that even poetry sometimes is prose poetry).

To help answer this student’s concern about line breaks, I pulled up my newest poem about my father’s death, “quotidian,” and walked the class through my first draft (typed in Notes on my iPhone and emailed to myself) as well as how I came to choose and then work within the stanza pattern.

The big-picture lessons from this activity include the following:

  • Helping students understand that writing forms, genres, and modes are driven (not constrained) by some conventions, but are also fluid.
  • Exploring that writers of all types of genres and forms work from a very similar toolbox—writers of poetry and prose care about sound, for example.
  • Emphasizing that form and meaning are related in writing, but that as soon as anyone lands on a firm definition, some piece challenges it.
  • Identifying how writers and readers navigate form, genre, and mode with purposefulness as well as awareness. As I explained about line breaks and stanzas when writing poetry, there is no magical formula, but most poets do seek some guiding pattern or patterns and then shape poetry within or against those patterns.

Many years ago as a high school English teacher, I gradually shifted away from defining poetry during our poetry unit, choosing instead to ask throughout, “What makes poetry, poetry?” We simply came to understand poetry better by asking a question instead of finding a clear definition.

I remain convinced that seeking greater awareness about text is a long journey, best guided by always seeking a definition rather than imposing one.

Regardless of the definition we discover, or fail to uncover, I hope that students remain in awe as I am each time I read “Gate A-4” even as I also remain conflicted about just what the thing is she is reading aloud on the video.

What Is the Relationship among NAEP Scores, Educational Policy, and Classroom Practice?

Annually, the media, the public, and political leaders over-react to and misrepresent the release of SAT and ACT scores from across the US. Most notably, despite years of warnings from the College Board against the practice, many persist in ranking states by average state scores, ignoring that vastly different populations are being incorrectly compared.

These media, public, and political reactions to SAT and ACT scores are premature and superficial, but the one recurring conclusion that would be fair to emphasize is that, as with all standardized test data, the most persistent correlates of these scores are the socio-economic status of the students’ families and the educational attainment of their parents.

Over many decades of test scores, in fact, educational policy and classroom practices have changed many times, and the consistency of those policies and practices has been significantly lacking and almost entirely unexamined.

For example, when test scores fell in California in the late 1980s and early 1990s, the media, public, and political leaders all blamed the state’s shift to whole language as the official reading policy.

This was a compelling narrative that, as I noted above, proved to be premature and superficial—relying on the most basic assumptions of correlation. A more careful analysis exposed two powerful facts: California test scores were far more likely to have dropped because of drastic cuts to educational funding and a significant influx of English language learners; and (here is a very important point) even as whole language was the official reading policy of the state, few teachers were implementing whole language in their classrooms.

This last point cannot be emphasized enough: throughout the history of public education, because teaching is mostly a disempowered profession (primarily staffed by women), one recurring phenomenon is that teachers often shut their doors and teach—claiming their professional autonomy by resisting official policy.

November 2019 has brought us a similar and expected round of outlandish and unsupported claims about NAEP data. With the downward trend in reading scores since 2017, this round is characterized by sky-is-falling political histrionics and hollow fist-pounding that NAEP scores have proven policies a success or a failure (depending on the agenda).

If we slip back in time just a couple of decades, to when the George W. Bush administration heralded the “Texas miracle” as a template for No Child Left Behind, we find a transition from state-based educational accountability to federal accountability. But this moment in political history also raised the stakes on scientifically based educational policy and practice.

Specifically, the National Reading Panel was charged with identifying the highest quality research on effective reading programs and practices. (As a note, while the NRP touted its findings as scientific, many, including a member of the panel itself [1], have discredited the quality of the findings and accurately cautioned against political misuse of the findings to drive policy.)

Here is where our trip back in history may sound familiar during this current season of NAEP hand wringing. While Secretary of Education (2005-2009), Margaret Spellings announced that a jump of 7 points in NAEP reading scores from 1999 to 2005 was proof that No Child Left Behind was working. The problem, however, was in the details:

[W]hen then-Secretary Spellings announced that test scores were proving NCLB a success, Gerald Bracey and Stephen Krashen exposed one of two possible problems with the data. Spellings either did not understand basic statistics or was misleading for political gain. Krashen detailed the deception or ineptitude by showing that the gain Spellings noted did occur from 1999 to 2005, a change of seven points. But he also revealed that the scores rose as follows: 1999 = 212; 2000 = 213; 2002 = 219; 2003 = 218; 2005 = 219. The jump Spellings used to promote NCLB and Reading First occurred from 2000 to 2002, before the implementation of Reading First. Krashen notes even more problems with claiming success for NCLB and Reading First, including:

“Bracey (2006) also notes that it is very unlikely that many Reading First children were included in the NAEP assessments in 2004 (and even 2005). NAEP is given to nine year olds, but RF is directed at grade three and lower. Many RF programs did not begin until late in 2003; in fact, Bracey notes that the application package for RF was not available until April, 2002.”
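The arithmetic behind Krashen’s point is easy to verify; a minimal sketch using only the scores quoted above:

```python
# NAEP reading scores as quoted above (Krashen).
scores = {1999: 212, 2000: 213, 2002: 219, 2003: 218, 2005: 219}
years = sorted(scores)

print(f"1999 -> 2005: {scores[2005] - scores[1999]:+d} points")  # the +7 Spellings cited

# Year-over-year changes show where the gain actually occurred.
for prev, curr in zip(years, years[1:]):
    print(f"{prev} -> {curr}: {scores[curr] - scores[prev]:+d}")

# The +6 jump falls between 2000 and 2002, before Reading First was
# implemented, so it cannot be credited to NCLB.
```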

Jump to the 2019 NAEP data release to hear Secretary of Education Betsy DeVos shout that the sky is falling and that public education needs more school choice—without a shred of scientific evidence establishing causal relationships of any kind among test data, educational policy, and classroom practice.

But an even better example has been unmasked by Gary Rubinstein, who discredits Louisiana’s Chief of Change John White (praised by former Secretary of Education Arne Duncan) for proclaiming that his educational policy changes caused the state’s NAEP gain in math:

So while, yes, Louisiana’s 8th grade math NAEP in 2017 was 267 and their 8th grade math NAEP in 2019 was 272 which was a 5 point gain in that two year period and while that was the highest gain over that two year period for any state, if you go back instead to their scores from 2007, way before their reform effort happened, you will find that in the 12 year period from 2007 to 2019, Louisiana did not lead the nation in 8th grade NAEP gains.  In fact, Louisiana went DOWN from a scale score of 272.39 in 2007 to a scale score of 271.64 in 2019 on that test.  Compared to the rest of the country in that 12 year period.  This means that in that 12 year period, they are 33rd in ‘growth’ (is it even fair to call negative growth ‘growth’?).  The issue was that from 2007 to 2015, Louisiana ranked second to last on ‘growth’ in 8th grade math.  Failing to mention that relevant detail when bragging about your growth from 2017 to 2019 is very sneaky.
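Rubinstein’s point reduces to which window of time one selects; a minimal sketch using the scale scores he cites:

```python
# Louisiana 8th-grade math NAEP scale scores as cited by Rubinstein:
# 2017 and 2019 rounded in his text (267, 272); 2007 and 2019 also
# given precisely (272.39, 271.64).
gain_2017_2019 = 272 - 267          # +5: the highest two-year gain of any state
change_2007_2019 = 271.64 - 272.39  # -0.75: a net decline over twelve years

print(f"2017 -> 2019: {gain_2017_2019:+d} points")
print(f"2007 -> 2019: {change_2007_2019:+.2f} points")

# The same state can lead the nation over a cherry-picked two-year
# window while showing negative "growth" over the longer period.
```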

The media and public join right in with this political playbook that has persisted since the early 1980s: Claim that public education is failing, blame an ever-changing cause for that failure (low standards, public schools as monopolies, teacher quality, etc.), promote reform and change that invoke “scientific evidence” and “research,” and then make unscientific claims of success (or yet more failure) based on simplistic correlation while offering no credible or complex research to support those claims.

Here is the problem, then: What is the relationship among NAEP scores, educational policy, and classroom practice?

There are only a couple of fair responses.

First, 2019 NAEP data replicate a historical fact of standardized testing in the US—the strongest and most persistent correlations to that data are with the socio-economic status of the students, their families, and the states. When students or average state data do not conform to that norm, these are outliers that may or may not provide evidence for replication or scaling up. However, you must consider the next point as well.

Second, as Rubinstein shows, the best way to draw causal relationships among NAEP data, educational policy, and classroom practices is to use longitudinal data; I would recommend at least twenty years (reaching back to NCLB), but thirty years would add in a larger section of the accountability era, which began in the 1980s and was in wide application across almost all states by the 1990s.

The longitudinal data would next have to be aligned with the current educational policy in math and reading for each state correlated with each round of NAEP testing.

As Bracey and Krashen cautioned, that correlation would have to align accurately with when each policy was implemented, allowing enough time to claim that the change impacted the sample of students taking NAEP.
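No such dataset exists in ready form; as a sketch of the structure this analysis would require (every column name and value below is a hypothetical illustration, not real data):

```python
import pandas as pd

# Hypothetical structure only: a real analysis would need rows for every
# state and every NAEP wave, plus fidelity-of-implementation measures
# that, as noted below, almost never exist.
df = pd.DataFrame({
    "state": ["MS", "MS", "LA", "LA"],
    "naep_year": [2017, 2019, 2017, 2019],
    "naep_reading_4th": [222, 219, 212, 210],  # illustrative values only
    "policy_start": [2013, 2013, 2015, 2015],  # year reading policy took effect
})

# A policy can plausibly affect a NAEP wave only if the tested cohort
# spent its early grades under that policy (cf. Bracey on Reading First:
# NAEP tests nine-year-olds, so several years of exposure are required).
df["years_exposed"] = df["naep_year"] - df["policy_start"]
eligible = df[df["years_exposed"] >= 4]
print(eligible[["state", "naep_year", "years_exposed"]])
```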

But that isn’t all, as complex and overwhelming as this process already is.

We must address the lesson from the so-called whole language collapse in California by documenting whether or not classroom practice implemented state policy with some measurable level of fidelity.

This process is a herculean task, and no one has had the time to examine 2019 NAEP data in any credible way to make valid causal claims about the scores and the impact of educational policy and classroom practice.

What seems fair, however, to acknowledge is that there is no decade over the past 100 years when the media, public, and political leaders deemed test scores successful, regardless of the myriad of changes to policies and practices.

Over the history of public education, also, before and after the accountability era began, student achievement in the US has been mostly a reflection of socio-economic factors, and less about student effort, teacher quality, or any educational policies or practices.

If NAEP data mean anything, and I am prone to say they are much ado about nothing, we simply do not know what that is because we have chosen political rhetoric over the scientific process and research that could give us the answers.


[1] See:

Babes in the Woods: The Wanderings of the National Reading Panel, Joanne Yatvin

Did Reading First Work?, Stephen Krashen

My Experiences in Teaching Reading and Being a Member of the National Reading Panel, Joanne Yatvin

I Told You So! The Misinterpretation and Misuse of The National Reading Panel Report, Joanne Yatvin

The Enduring Influence of the National Reading Panel (and the “D” Word)

 

On Normal, ADHD, and Dyslexia: Neither Pathologizing, Nor Rendering Invisible

In 1973, Elliott Kozuch explains, “the American Psychiatric Association (APA) — the largest psychiatric organization in the world — made history by issuing a resolution stating that homosexuality was not a mental illness or sickness. This declaration helped shift public opinion, marking a major milestone for LGBTQ equality.”

Homosexuality in many eras and across many cultures has been rendered either invisible (thus, the “closet” metaphor) or pathologized as an illness (thus, the horror that is conversion therapy).

This troubling history of responses to homosexuality sets the inexcusable negative consequences of shame and misdiagnosis/mistreatment against the more humane and dignified recognition that “normal” in human behaviors is a much broader spectrum than either invisibility or pathologizing allows.

How we determine “normal” in formal education is profoundly important, and the current rise of dyslexia advocacy as that impacts and drives reading legislation and practice for all students parallels the dangers identified above with rendering invisible or pathologizing children who struggle with reading.


Further, this more recent focus on dyslexia looks incredibly similar to the increased diagnosis of ADHD, which was initially left invisible and then pathologized (probably over-diagnosed and heavily medicated).

Let’s focus first, then, on ADHD, and how the dynamic of “normal,” “invisible,” and “pathologized” impacts children.

In 2013 Maggie Koerth-Baker reported:

The number of diagnoses of Attention Deficit Hyperactivity Disorder has ballooned over the past few decades. Before the early 1990s, fewer than 5 percent of school-age kids were thought to have A.D.H.D. Earlier this year, data from the Centers for Disease Control and Prevention showed that 11 percent of children ages 4 to 17 had at some point received the diagnosis — and that doesn’t even include first-time diagnoses in adults.

But here is the problem:

That amounts to millions of extra people receiving regular doses of stimulant drugs to keep neurological symptoms in check. For a lot of us, the diagnosis and subsequent treatments — both behavioral and pharmaceutical — have proved helpful. But still: Where did we all come from? Were that many Americans always pathologically hyperactive and unable to focus, and only now are getting the treatment they need?

Probably not. Of the 6.4 million kids who have been given diagnoses of A.D.H.D., a large percentage are unlikely to have any kind of physiological difference that would make them more distractible than the average non-A.D.H.D. kid. It’s also doubtful that biological or environmental changes are making physiological differences more prevalent. Instead, the rapid increase in people with A.D.H.D. probably has more to do with sociological factors — changes in the way we school our children, in the way we interact with doctors and in what we expect from our kids.

For context, when I was exploring the ADHD phenomenon in 2013, I ran across a provocative piece from 2012 about ADHD in France, Why French Kids Don’t Have ADHD, published in Psychology Today. Immediately, this spoke both to my concern about pathologizing human behavior that may fall within a broader understanding of normal and to my skepticism about immediately medicating instead of addressing diet, environment, etc.

However, the situation in France is far more complicated, as noted in a piece also published by Psychology Today in 2015, French Kids DO Have ADHD, this time acknowledging:

In other words, it’s not that French kids, or Europeans, don’t have ADHD, says French child psychiatrist Michel Lecendreux, but that they’re clinically invisible [emphasis added]. “It’s just not very well understood, nor is it very well-diagnosed, nor well-treated.” Lecendreux, a researcher at the Robert Debre Hospital in Paris who also heads the scientific commission for the French ADHD support group HyperSupers, told me that his research suggests that fewer than one-third of French children who have ADHD are being diagnosed.

The circumstances around ADHD in France reveal the power of narratives and cultural responses to human behavior, any people’s perception of “normal.” A study by Sébastien Ponnou and François Gonon from 2017, in fact, details the pervasiveness of different narratives about ADHD in French media:

Two models of attention deficit hyperactivity disorder (ADHD) coexist: the biomedical and the psychosocial. We identified in nine French newspapers 159 articles giving facts and opinions about ADHD from 1995 to 2015. We classified them according to the model they mainly supported and on the basis of what argument. Two thirds (104/159) mainly supported the biomedical model. The others either defended the psychodynamic understanding of ADHD or voiced both models. Neurological dysfunctions and genetic risk factors were mentioned in support of the biomedical model in only 26 and eight articles, respectively. These biological arguments were less frequent in the most recent years. There were fewer articles mentioning medication other than asserting that medication must be combined with psychosocial interventions (14 versus 57 articles). Only 11/159 articles claimed that medication protects from school failure. These results were compared to those of our two previous studies. Thus, both French newspapers and the specialized press read by social workers mainly defended either the psychodynamic understanding of ADHD or a nuanced version of the biomedical model. In contrast, most French TV programmes described ADHD as an inherited neurological disease whose consequences on school failure can be counteracted by a very effective medication. (abstract)

Back in the US, in Room for Debate from 2016, several experts challenged overpathologizing children with ADHD labels, the racial disparity in that pathologizing, and the dangers of medicating for ADHD as an avenue to controlling children.

Thus, the interaction among the fields of medicine and psychology, media representations of clinical conditions, and the spectrum along “normal,” “invisible,” and “pathologized” has profound consequences for children/teens and formal education.

Currently, we are witnessing mainstream media build a compelling narrative about the “science of reading” and the needs of children with dyslexia; this is a narrative about children with dyslexia being rendered invisible and there existing a “science of reading” that is the medicine necessary to cure that pathology.

However, as the examinations of homosexuality and ADHD above demonstrate, when it comes to the humanity and dignity of children being served by the institution of public education, we cannot tolerate either rendering them and their behaviors invisible or over-pathologizing, and thus misdiagnosing/mistreating, them.

This leads to the current rush to assess and identify dyslexia as a foundational part of teaching all children to read, policies about which the International Literacy Association (2016) offers several concerns:

Errors in reading and spelling made by children classified as dyslexic are not reliably different from those of younger children who are not classified as dyslexic. Rather, evidence suggests that readers with similar levels of competence make similar kinds of errors. This does not suggest a greater incidence of dyslexia, but instead that some difficulties in learning to work with sounds are normal.

Yet, the rise in advocacy for identifying dyslexia has gained significant momentum in state policy even as ILA warns:

Some have advocated for an assessment process that determines who should and should not be classified as dyslexic, but this process has been shown to be highly variable across states and districts in the United States, of questionable validity, and too often resulting in empirically unsupported, one-size-fits-all program recommendations [emphasis added].

No child struggling to read should have that struggle rendered invisible, but pathologizing behavior that does not conform to a narrow definition of normal also carries significant and negative consequences. As ILA notes above, a more reasonable approach is simply to expand the spectrum of normal while building a supportive environment tempered with patience.

I teach a graduate student whose child is now in a school for dyslexic children. That child was floundering personally and academically in traditional school, and now flourishes, something everyone would applaud.

The parent, however, made a really powerful observation, noting that the child’s recent success comes in a school that champions Orton-Gillingham-based reading programs [1] (often OG for short).

Advocates for universal screening for dyslexia also advocate for systematic intensive phonics for all students, specifically OG. Yet, this child is now in a school with a 1:8 teacher-student ratio and a guaranteed 1.5 hours a day of instruction at a 1:2 teacher-student ratio.

The parent stated flatly that almost any child would flourish in those conditions and that the different way the child is being taught to read is not necessarily the real cause of the new success. I must add, we have absolutely no research exploring these dynamics and controlling for variables that would help us understand the importance of reading programs versus learning/teaching conditions (see, for example, unfounded and overstated responses to 2019 NAEP reading scores).

Struggling to read is, in fact, quite normal, and a long, chaotic process. Teaching reading is very complex, unique to each child, and as ILA clarifies, “there is no certifiably best method for teaching children who experience reading difficulty.”

Demands that all children attain some prescribed proficiency in reading by third grade are artificial and themselves unnatural, abnormal.

No child should be invisible in schools, but pathologizing childhood behavior that is quite normal because some adults have irresponsible deadlines and expectations for those children is inexcusable.

Teaching children to read needs a new normal, one that acknowledges the power of learning and living conditions while avoiding the dangers of finding fault in any child that we can simply cure with some magical quick fix.


[1] From ILA:

[R]esearch does not support the common belief that Orton-Gillingham–based approaches are necessary for students classified as dyslexic (Ritchey & Goeke, 2007; Turner, 2008; Vaughn & Linan-Thompson, 2003). Reviews of research focusing solely on decoding interventions have shown either small to moderate or variable effects that rarely persist over time, and little to no effects on more global reading skills. Rather, students classified as dyslexic have varying strengths and challenges, and teaching them is too complex a task for a scripted, one-size-fits-all program (Coyne et al., 2013; Phillips & Smith, 1997; Simmons, 2015). Optimal instruction calls for teachers’ professional expertise and responsiveness, and for the freedom to act on the basis of that professionalism.

Resisting the Silver Bullet in Literacy Instruction (and Dyslexia): “there is no certifiably best method for teaching children who experience reading difficulty”

The Mind, Explained episode 1, “Memory,” introduces viewers to some disorienting facts about human memory, delivered in the soothing and authoritative voice-over of Emma Stone.

The episode shares a 9-11 memory from a young woman, recalling sitting as a child in her classroom and watching the smoke from the Twin Towers’ collapse billowing past the window as she worried about her mother working in the city.

Her memory is vivid and compelling, but it is also factually wrong—both in the detail of the billowing smoke (the window didn’t face that direction, and the school’s proximity would not have allowed her to see the event) and in the fact that her mother was not in the city that day.

Memory, the episode reveals, is often deeply flawed, as much a construction by the person as any sort of accurate recall.

Watching this, I thought about one of the most misinterpreted poems commonly taught in schools, Robert Frost’s “The Road Not Taken.”

This poem, and how people almost universally misread it, parallels the problems with memory: readers tend to impose onto a text what they predict or want that text to say; and the verbatim elements of a text, the raw decoding of words, also depend heavily on schema, what the reader knows and the correlations that reader makes with words and phrases.

Frost’s poem, by the way, is about the significance of choosing, in that when we choose we determine our path. But the poem literally states multiple times that the paths are the same; therefore, the poem is not some inspirational poster about making the right choice—although this is the sort of simplistic message many people want to read.

Since I have been wrestling with the recent rise of dyslexia advocates calling for intensive systematic phonics instruction and dyslexia screening for all students as well as the over-reaction to the 2019 reading data from NAEP, I believe the memory and poem interpretation phenomena help explain how and why the “science of reading” narrative is so effective while simultaneously being deeply misguided (and misleading).

The media and most people find a single explanation for reading problems compelling; the argument that more students have dyslexia than are being identified and that one program type (intensive systematic phonics, usually Orton-Gillingham–based) will cure the low reading achievement crisis matches what people want to hear.

The disturbing irony is that those oversimplifying reading challenges and solutions as “the science of reading” are themselves not being very scientific even as they idealize “scientific research.”

I have argued against this in education for many years, and have identified this broadly as technocratic, an over-reliance on narrow types of measurement in order to control the teaching/learning process in ways that are not realistic in real-world classrooms.

The call for reading instruction driven by the “science of reading,” then, comes against several problems. First, literacy acquisition and instruction are both inherently messy and chaotic. Despite our seeking efficient and effective methods, mandating that all students develop the same ways and at the same rates is futile, and harmful. Concurrent with that reality, highly structured teaching of literacy is something that is manageable but likely ineffective, and again, harmful.

Narrow expectations for “scientific” tend to include controlling for external factors and reaching generalizable conclusions—both of which can be inappropriate for guiding teaching real students in an actual classroom.

Should reading policy and practice be informed by scientific studies? Of course, but any teacher must frame that against the needs of each student, needs that may dictate practices outside the parameters of narrow research. And every teacher has another type of evidence—their practice.

We must also recognize, however, that the science of reading, and the science around dyslexia, are not as clear-cut as some advocates seem to suggest.

When we ask the questions being posed now—Is there a reading crisis (distinct from the historical trends of reading test scores)? Is there a dyslexia crisis? Should all students be screened for dyslexia? Should all students receive intensive systematic phonics instruction?—the answers do not match the current media and advocacy frenzy.

The International Literacy Association’s 2016 research advisory on dyslexia offers a much different framing of “scientific,” in fact:

Both informal and professional discussions about dyslexia often reflect emotional, conceptual, and economic commitments, and they are often not well informed by research.

First, some children, both boys and girls, have more difficulty than others in learning to read and write regardless of their levels of intelligence or creativity. When beginning literacy instruction is engaging and responsive to children’s needs, however, the percentage of school children having continuing difficulty is small (Vellutino et al., 1996; Vellutino, Scanlon, & Lyon, 2000).

Second, the nature and causes of dyslexia, and even the utility of the concept, are still under investigation [emphasis added]….

Third, dyslexia, or severe reading difficulties, do not result from visual problems producing letter and word reversals (Vellutino, 1979)….

Errors in reading and spelling made by children classified as dyslexic are not reliably different from those of younger children who are not classified as dyslexic. Rather, evidence suggests that readers with similar levels of competence make similar kinds of errors. This does not suggest a greater incidence of dyslexia, but instead that some difficulties in learning to work with sounds are normal [emphasis added]….

[I]nterventions that are appropriately responsive to individual needs have been shown to reduce the number of children with continuing difficulties in reading to below 2% of the population [emphasis added] (Vellutino et al., 2000).

As yet, there is no certifiably best method for teaching children who experience reading difficulty [emphasis added] (Mathes et al., 2005). For instance, research does not support the common belief that Orton-Gillingham–based approaches are necessary for students classified as dyslexic [emphasis added] (Ritchey & Goeke, 2007; Turner, 2008; Vaughn & Linan-Thompson, 2003)….

Some have advocated for an assessment process that determines who should and should not be classified as dyslexic, but this process has been shown to be highly variable across states and districts in the United States, of questionable validity, and too often resulting in empirically unsupported, one-size-fits-all program recommendations [emphasis added].

The science of reading actually suggests we should not see a reading crisis in our test data, and that we should not conflate low test scores in reading with special needs such as dyslexia.

And we must reject calls to adopt singular evaluation processes and reading programs that claim to address the natural development and challenges of 98% of students.

All students, both those in the general population and those with special needs, need rich and robust literacy instruction that helps teachers recognize their needs in order to foster their natural development over many years (resisting as well the false notion that 3rd grade is a magical moment for all students to attain the same mastery of literacy).

While there are certainly good intentions behind calls for identifying and serving students with dyslexia, the over-reaction to reading test scores and oversimplification of “scientific” are pathologizing and stigmatizing students, and eroding effective teaching and authentic learning.

Like memory and poems by Frost, teaching reading and becoming a reader are complicated. Many find that so unpleasant, they have retreated into a mantra that isn’t itself very well grounded in evidence, the hallmark of “scientific.”

The evidence we use on reading, test scores, has for at least a century shown us that there really is no normal development of reading at predictable benchmarks and that measurable reading achievement has always been and continues to be a powerful marker for the socio-economic status of the students tested.

Technocrats do not want evidence that is historical and sociological, however, preferring instead to impose a problem and a solution onto the data in ways that are as comforting as a detailed, though significantly flawed, memory.

See Also

Scientific evidence on how to teach writing is slim

Proof Points (Writing Instruction)

The Big Lie about the “Science of Reading”: NAEP 2019 Edition

After the release of the 2017 NAEP reading scores, states such as Mississippi launched a campaign to celebrate the success of their reading legislation. This effort coincided with a recent explosion in states adopting reading legislation driven by dyslexia advocates who promote systematic intensive phonics for all students.

The claims coming from Mississippi didn’t seem credible, so I began what turned into a very long (and maybe endless) examination of the growing power of dyslexia advocates to drive what are essentially very bad forms of reading legislation, notably third grade retention and systematic intensive phonics for all students.

In my initial analysis of 2017 NAEP reading scores for 4th and 8th grades, I addressed the use of “the science of reading” as a veneer for ideological advocacy; I also focused on the misuse by dyslexia/phonics advocates and the media of the National Reading Panel as well as flawed claims about and definitions of “balanced literacy” and “whole language,” including mostly ahistorical understandings of how reading has been taught and discussed in political and public forums.

With the release of 2019 NAEP data, as we should expect, the same folks are back at over-reacting to and misunderstanding standardized reading test data (mostly mainstream media), and dyslexia/phonics advocates are cherry-picking evidence to reinforce their ideological advocacy.

All in all, these responses to NAEP data are lazy, and incredibly harmful.

Broadly, responses by the media and advocates have been overly simplistic, lacking even a modicum of effort to tease out in a scientific way (ironic, eh?) mere correlations from actual causal associations among student demographics, reading policy, reading programs, the fidelity of implementing policy/programs, NAEP testing quality (how valid a proxy are NAEP reading tests for critical reading ability?), etc.

In a Twitter thread, I attempt to make a case against rushing to judgment based on 2019 NAEP reading data:

A little NAEP thread:

In 2017 MS made overstated claims about their NAEP reading scores, hiding the fact that 4th grade bumps disappeared by 8th grade and that NAEP scores remain mostly correlated with poverty; see:

The Big Lie about the “Science of Reading” (Updated)

2019 NAEP reading scores are likely to be a reboot of that for MS since 4th grade reading is an outlier among states in terms of gains but MS remains about average in 4th grade.

But very low still in 8th.

Only fair things to say about new round of NAEP reading scores:

• The US has never had a period over the last 100 years when we said “reading scores are where they should be.”
• There is always a claim of “reading crisis.”
• This is irrespective of how reading is taught.
• NAEP scores, like all standardized test scores, are mostly (60% +) correlated to out-of-school factors.
• NAEP scores are only marginally about student achievement/reading, teacher/teaching quality, or reading program effectiveness.
• NAEP scores are very pale proxies of reading.

Recent rounds of NAEP reading scores, however, are revealing how really bad reading policies (grade retention, intensive systematic phonics for all) can in the short term raise scores while likely deeply harming reading and readers. 4th-grade reading score bumps are mirages.

Equity gap between rich/poor reflected in NAEP reading scores amplifies the reality in the US that the rich get richer while the poor get poorer. Wealth = high achievement; poverty = low achievement. Student outcomes are a consequence of social negligence not student ability.

Placed in the recent context of 2017 NAEP reading data and a wider recognition that student demographics (race, socioeconomic status) are historically and currently the greatest causal factors in student standardized test scores, the fairest argument to make in the wake of NAEP 2019 is that the matrix of reading policies, and whether or not those policies are implemented at all or implemented well (elements we have no data to support in any fashion), cannot be identified as a success or a failure. We may, however, be able to suggest that focusing on policy, standards, programs, and high-stakes testing simply does not change measurable reading outcomes in positive ways.

If you are fair and careful with the data I am including below, the correlations among all of the factors do not paint any clear picture at all about the effectiveness of programs or policy (again, even if we assume those programs and policies are being implemented at all or well).

Anyone using this data to claim “grade retention works” or “systematic intensive phonics works” is simply being deeply dishonest because no one has done any of the necessary work to tease out those claims in a scientific way (random sampling, controlling for non-instructional factors, investigating fidelity to policies and programs, etc.).

In other words, those advocating for the “science of reading” are making no effort to be scientific themselves in determining whether their claims are valid.

Nonetheless, here are the updated data in a manageable set of charts:

[Charts: NAEP Reading 2019 data, panels 1–8]

If we genuinely believe that a few points here or there, comparing entirely different populations of students under ever-shifting conditions both in their lives and in their education, are in fact not just statistically significant but meaningful, then we have a wealth of evidence above to suggest that all the standards, testing, and policies are actually degrading student reading achievement.

Finally, I want to stress, the greatest problem exposed by how the media and dyslexia/phonics advocates are responding to NAEP 2019 is that reading is too often a political and ideological football, and students in real classrooms and real lives are being reduced to petty games.

Again, at no point over the past 100 years have the crisis and failure arguments about reading achievement been any different than at this exact moment—regardless of how students have been taught to read (including peak years of intensive phonics and jumbled claims of implementing whole language).

How states mandate and implement reading instruction as well as relentlessly test it in the worst possible formats is a tale of too many cooks in the kitchen, with most of them having no credibility.

See Also

Third-Grade Reading Legislation

retention legislation UPDATED

Third Grade Reading Policies (2012)

From Bruce Baker

 

FYW Students Respond to Warner’s The Writer’s Practice: Progress, Not Perfection

When John Warner released The Writer’s Practice: Building Confidence in Your Nonfiction Writing soon after his Why They Can’t Write, I highly recommended both volumes for those teaching writing (here and here). Of course, a much better barometer of the value of books on how to write and how to teach writing is to put them in the hands of students and teachers.

A bit past midway through this fall semester of 2019, my first-year writing students have just finished reading and reflecting on The Writer’s Practice; the final reflections and our in-class debrief yesterday revealed high praise and key lessons we who teach writing can learn from what my students valued about the book, and how they framed those lessons for them as students and writers.

One student began her reflection as follows:

As I opened up The Writer’s Practice to read the final section, I felt myself experiencing emotions similar to those when I am about to finish a really wonderful piece of fictional writing. This feeling for me is more guttural, and it comes about when I am truly sad or upset to be finishing a piece. Most of the time it is because there is a change in the plot that I didn’t enjoy, or I just truly don’t want the happily ever after to conclude. But I was genuinely confused as to why I experienced this feeling when reading the final few sentences of this piece. Never in my life has I been saddened by the conclusion of reading a book for school, fiction or nonfiction, besides The Catcher and the Rye and now this book. Because of this, I have come to the conclusion that I truly did enjoy reading this book. And I believe it was mostly because of the informal language used throughout. The way in which Warner speaks to the audience is something very intriguing to me, and it made me feel like I was holding a conversation with an actual person instead of reading a boring and monotonous book for school.

In the two sections of FYW I teach, I read in reflections and heard in our discussion some powerful patterns throughout the student endorsements of Warner’s book as an effective textbook for teaching writing. Those patterns include the following:

  • One student expressed relief that Warner encouraged progress, and not perfection. In fact, many students recognized that Warner’s lessons combined with my approach to teaching writing had significantly reduced their levels of stress and anxiety about writing. The overwhelming result of reducing stress and anxiety for students as writers is that they are more eager to write, they produce better early drafts of their essays, and they are also more motivated to revise (and even begin again) in order to produce the essays they want to write. In short, Warner’s messages resonate with and complement my commitment to having students choose their topics and types of essays as well as my low-stakes approach to teaching (delaying grades, emphasizing feedback and revision, and fostering a writing and learning community among students and with me as a mentor/teacher).
  • Several students responded directly to Warner speaking from his own humble authority as a practicing writer, an authority grounded in his recognition that becoming a writer is a journey, and not a destination. Here, I think, was one of the most powerful aspects of how students praised Warner’s book: students valued that Warner did not speak down to them; they felt respected and appreciated the casual and empathetic tone Warner maintains throughout the book. A common response was that teaching and textbooks can too often be condescending, and students have found that this book and some of the different aspects of my teaching and writing instruction honored their basic human dignity.
  • In terms of nuts-and-bolts writing lessons that were effective because of Warner’s text, I would highlight that students almost universally came to recognize and value the role of having a clear audience at the center of their work as writers. I have always struggled with helping students move away from writing for the teacher/professor (working mostly as compliant students) and toward thinking and working more as writers (drafting for real audiences). Reading and engaging with Warner’s text while completing my second essay assignment, a public essay that incorporates hyperlinks for citation, was mentioned by a number of students as a very effective experience, especially in terms of writing with an audience in mind and writing by choice about a topic that interests and even invigorates the writer/student.
  • Broader and foundational lessons (ones that are essential for students making the transition from high school to college, from writing like a student to writing as a scholar) include students rethinking what essay writing entails (different disciplines have different expectations for essays), students moving away from rules-based thinking to conventional awareness, and students thinking and working more purposefully through their writing process. The one-size-fits-all effect of the five-paragraph essay (which all of my students bring to the class in some form) and the tyranny of grammar rules and writing mandates definitely were at least strongly confronted if not entirely debunked through the combination of students reading Warner and receiving similar messages in my class.

As a teacher, specifically of writing, this experience with assigning Warner’s The Writer’s Practice has reinforced the importance of the relationship among the teacher, students, and the textbook. When the messages and learning environment are cohesive, like a good piece of writing, everything is more effective.

My initial recommendation for this book is now strongly supported by actually incorporating it into my FYWs, but I must hasten to add a final word of caution as well.

This really positive experience with teaching writing and with students showing real and often observable learning as writers also exposes one of the most challenging aspects of formal writing instruction and learning in formal schooling: Even as first-year college students, these young writers are at a very early stage of development as writers, and thinkers.

Once again, placing too much emphasis on evaluating my teaching, evaluating the students’ writing, or evaluating the effectiveness of an FYW course is quite dangerous and likely very misleading.

Many of the students expressing effusive praise for Warner stumbled mightily in the writing of those reflections, and continue to struggle in their formal essay assignments. People evaluating as outsiders those students’ writing would likely find it hard to identify the growth (or quality), especially the changes in attitudes these students have experienced about writing forms and the writing process.

For students as writers and for me as a teacher of writing, it remains an issue of progress, not perfection.