ARTFL: General Description

ARTFL: A Textual Database

2000 Texts/13th-20th Centuries

Literature, Philosophy, Arts, Sciences...

A Cooperative Project:
  • Centre National de la Recherche Scientifique
  • The University of Chicago

    A Research Tool for Scholars and Students in all Areas of French Studies.

    The ARTFL Project

    In 1957 the French government initiated the creation of a new dictionary of the French language, the Trésor de la Langue Française. In order to provide access to a large body of word samples, it was decided to transcribe an extensive selection of French texts for use with a computer. Twenty years later, a corpus totaling some 150 million words had been created, representing a broad range of written French -- from novels and poetry to biology and mathematics -- stretching from the seventeenth to the twentieth centuries.

    It soon became apparent that this corpus of French texts was an important resource not only for lexicographers, but also for many other types of humanists and social scientists engaged in French studies -- on both sides of the Atlantic. The result of this realization was American and French Research on the Treasury of the French Language (ARTFL) -- a cooperative project established in 1981 by the Centre National de la Recherche Scientifique and the University of Chicago.

    The ARTFL project has focused on three objectives over the past eight years:

    The Database

    At present the corpus consists of nearly 2000 texts, ranging from classic works of French literature to various kinds of non-fiction prose and technical writing. The eighteenth, nineteenth and twentieth centuries are about equally represented, with a smaller selection of seventeenth-century texts as well as some medieval and Renaissance texts. We have also recently added a Provençal database that includes 38 texts in their original spellings. Genres include novels, verse, theater, journalism, essays, correspondence, and treatises. Subjects include literary criticism, biology, history, economics, and philosophy. In most cases standard scholarly editions were used in converting the text into machine-readable form, and the data contain page references to these editions.

    New Opportunities for Research

    The ARTFL database is one of the largest of its kind in the world. The number, variety and historical range of its texts allow researchers to go well beyond the usual narrow focus on single works or single authors. The database permits both the rapid exploration of single texts and inter-textual research of a kind virtually impossible without the aid of a computer.

    ARTFL on the Web

    In implementing its World Wide Web interface, the ARTFL project has sought to keep two goals in mind:

    With the introduction of ARTFL on the Web, researchers have a new and easier way to access the database. This system is available to all subscribers who have access to the Internet. The WWW provides a user-friendly environment that eliminates the need to memorize special commands and addresses. Instead, users simply click the mouse button on the appropriate icon or word in order to proceed from one step to the next in their search of the ARTFL database. In addition, the ARTFL home page presents a variety of other options that were not previously available on the PhiloLogic system. Now users can read newsletters, consult FAQ sheets, and skim the ARTFL bibliography on-line.

    The PhiloLogic System

    Users familiar with our original system will have continued access to the ARTFL database through the PhiloLogic System. This is a second tool for ARTFL research, a menu-driven system with a sophisticated help program that can be accessed at any time. Users can log on to PhiloLogic by using a standard microcomputer or terminal and modem. PhiloLogic does not require previous computer experience -- in fact, this system provides an excellent opportunity to become acquainted with the possibilities of computer-assisted research and teaching. The ARTFL Project has written full documentation for PhiloLogic, including tutorials demonstrating the system from logging on to printing of results.

    Text Selection

    Both WWW and PhiloLogic provide several ways for users to select the texts they wish to analyze. Users may search a single text, texts by a single author, texts from a particular time period, texts with a particular word in the title, or all the texts in the database. For example, one might wish to work with all the texts of Balzac in the database, or all the texts published between 1750 and 1789. A single command will select these texts for further analysis.
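
    As an illustration only, the kind of selection described above can be sketched in Python. The Text record, the small CATALOGUE and the select_texts function are hypothetical names invented for this sketch; they are not part of WWW or PhiloLogic, and the five entries merely stand in for the database's bibliographic records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Text:
    """Hypothetical bibliographic record for one text."""
    author: str
    title: str
    year: int


# Illustrative sample catalogue, not ARTFL's actual metadata.
CATALOGUE = [
    Text("Balzac", "Le Père Goriot", 1835),
    Text("Balzac", "Eugénie Grandet", 1833),
    Text("Rousseau", "Du contrat social", 1762),
    Text("Voltaire", "Candide", 1759),
    Text("Zola", "Germinal", 1885),
]


def select_texts(catalogue, author=None, start=None, end=None, title_word=None):
    """Return the texts that satisfy every criterion given (None = ignore)."""
    selected = []
    for text in catalogue:
        if author is not None and text.author != author:
            continue
        if start is not None and text.year < start:
            continue
        if end is not None and text.year > end:
            continue
        if title_word is not None:
            # Match a whole word of the title, ignoring case.
            if title_word.lower() not in text.title.lower().split():
                continue
        selected.append(text)
    return selected


# All the texts of Balzac, and all texts published between 1750 and 1789.
balzac = select_texts(CATALOGUE, author="Balzac")
pre_revolution = select_texts(CATALOGUE, start=1750, end=1789)
```

    Calling select_texts with no criteria returns the whole catalogue, which corresponds to searching all the texts in the database.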

    Queries

    The ARTFL system supports a number of searches which can be performed on the texts selected by users working with WWW or PhiloLogic. A user may search for a single word, a word root, prefix, suffix or a list of words created by the user. For example, one might search for the word liberté in the texts published between 1789 and 1794, or all of the words associated with "artist" -- artiste, artistes, écrivain, écrivains, poète, poètes, etc. -- in the works of Zola. In many cases a researcher will be interested not merely in the occurrences of single words or lists of words, but in where those words occur in texts. Both WWW and PhiloLogic allow the user to search for logical combinations of words and word lists. One might, for example, search for all the occurrences of words associated with "artist" where words beginning with "fem" -- femme, femmes, feministe, etc. -- are found in the same sentence in the works of Zola. In addition, users of the WWW can search their texts for inflected verbs.
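
    The combined query described above, a word list together with a prefix in the same sentence, can be sketched roughly as follows. This is a simplified illustration and not the ARTFL search engine: the function names, the naive sentence splitter and the sample sentences are all assumptions made for this example.

```python
import re

# Word list for "artist", as given in the example above.
ARTIST_WORDS = {"artiste", "artistes", "écrivain", "écrivains", "poète", "poètes"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Return lowercase word tokens; accented letters count as word characters."""
    return re.findall(r"\w+", sentence.lower())


def cooccurrences(text, word_list, prefix):
    """Return sentences holding a word from word_list AND a word starting with prefix."""
    hits = []
    for sentence in split_sentences(text):
        tokens = tokenize(sentence)
        has_list_word = any(tok in word_list for tok in tokens)
        has_prefix_word = any(tok.startswith(prefix) for tok in tokens)
        if has_list_word and has_prefix_word:
            hits.append(sentence)
    return hits


# Invented sample sentences, not drawn from the ARTFL corpus.
SAMPLE = (
    "L'artiste regardait la femme en silence. "
    "Les poètes parlaient de liberté. "
    "Cette femme lisait un roman."
)

matches = cooccurrences(SAMPLE, ARTIST_WORDS, "fem")
```

    Only the first sample sentence satisfies both conditions; the second contains a word from the list but no word beginning with "fem", and the third the reverse.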

    Display

    Several display formats are available to the user of WWW and PhiloLogic. With both systems, results can be displayed on screen line by line, with the search word highlighted or centered. The user may browse through the full context of any result, examining many sentences or paragraphs around the target of the search. Both the WWW and PhiloLogic systems display the bibliographic information and page number for each occurrence and can sort the results on screen by date, author name, keywords and other fields.
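
    A line-by-line display with the search word centered is commonly called a keyword-in-context (KWIC) concordance. The following is a minimal sketch of that idea, assuming plain text input; the kwic function, its width parameter and the sample sentences are hypothetical and do not reproduce the WWW or PhiloLogic display code.

```python
import re


def kwic(text, keyword, width=30):
    """Return one line per occurrence of keyword, with the keyword in a fixed column."""
    lines = []
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-justify the left context so every keyword starts in the same column.
        lines.append(f"{left:>{width}}[{match.group()}]{right}")
    return lines


# Invented sample text, not drawn from the ARTFL corpus.
SAMPLE = "La liberté guide le peuple. Sans liberté, point de vertu."

concordance = kwic(SAMPLE, "liberté", width=20)
for line in concordance:
    print(line)
```

    Square brackets stand in for highlighting here; a real display would also attach the bibliographic information and page number to each line.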

    Output

    Since a single search can yield many interesting results which cannot be examined on screen in a single sitting, both WWW and PhiloLogic have a broad variety of output formats and delivery methods. Users can generate single-line KWIC concordances, multi-line concordances, indices, and bibliographies. Search results on the WWW can be saved directly to the directory of your choice. The output produced by a PhiloLogic search can be: These delivery methods are aimed at reducing the telecommunications charges as much as possible.

    Access To The ARTFL Database

    Access to the database is organized through a consortium of user institutions, in most cases universities and colleges, each of which pays an annual subscription fee. In 1995, this fee is $500 (US) for PhD-granting institutions and $250 (US) for other universities and colleges. All scholars and students at affiliated institutions have access to the database. Each individual user of the PhiloLogic system is issued an account on the ARTFL computer upon registering with ARTFL. There is no charge to users for storage of texts, computer time or electronic delivery of texts. Users pay only for printing done at the University of Chicago and telephone charges.

    Options for Users

    The systems described here are designed to deal with most of the requests made by individual researchers who perform the work themselves. WWW and PhiloLogic are flexible enough to allow for other possibilities. A scholar, for example, may want to have specialized research performed by an ARTFL research assistant (there is a charge for this service). The database may also be used for pedagogical purposes using computer accounts open to an entire class of students. ARTFL staff will consult closely with researchers to tailor the system to their needs.

    Future Development

    Both the CNRS and the University of Chicago are committed to the future growth of the ARTFL Project. This commitment includes expansion of the size of the database, correction of texts already in the database, and continued development of access and analysis software. Recent additions to the ARTFL database include the collected works of Beckett, and texts by Montaigne and Maupassant. Other texts now available to users include a French translation of the Bible by Louis Segond and the Trésor de la langue française dictionary (1606). ARTFL has also made increased efforts in the area of imaging. We are currently in the process of scanning various texts and images into the ARTFL database, including the Encyclopédie, French revolutionary pamphlets and illuminated manuscripts, as well as the Divine Comedy of Dante. The Project has obtained many important texts from other scholars and welcomes new contributions and proposals for collecting more texts. ARTFL expects to continue improving its research systems and plans to develop new analytical tools as well. We welcome joint projects with other institutions and invite you to contact us to discuss possible collaborations. Users at member universities will continue to play an important role in providing direction to the future development of the ARTFL Project.

    More Information

    The ARTFL Project is supported by a full-time staff at the University of Chicago. We encourage you to write or call us with any questions you may have about the project, such as the availability of texts, operation of the system, or the costs of using the database.

    The ARTFL Project
    American and French Research on the
    Treasury of the French Language
    Department of Romance Languages and Literatures
    University of Chicago
    1050 East 59th Street
    Chicago, Illinois 60637
    (773) 702-8488
    electronic mail: mark@barkov.uchicago.edu
    WWW: http://humanities.uchicago.edu/ARTFL/ARTFL.html
    gopher: gopher://gopher.uchicago.edu/11ucholarly/artfl