The University of Chicago Library
Kafkas Werke
Published by Chadwyck-Healey, Inc.
Help
Bibliographic Searching:
Title: In bibliographic searching, punctuation and
spacing must exactly match those in the online bibliography. Many titles and authors' names contain
accented characters and must be entered as such. To enter words
without paying attention to accents, simply turn
on Caps Lock and type in uppercase.
Date: The texts in the database range in date from 1897 to 1924. Some texts are undated; editorial materials are all listed as undated.
Genre: The following options are available.
- diaries = Tagebücher
- Editorial
- proseNSF = Nachgelassene Schriften und Fragmente
- proseDZL = Drucke zu Lebzeiten
- proseROM = Roman
Orthographic Considerations:
The character eszett (ß) has been resolved into double s (ss) for easier searching.
Data-Entry Idiosyncrasies:
Data-entry errors have been found in some databases, arising either from
typesetting errors in the original source or
from rekeying the documents. One should therefore avoid making arguments from
silence. In particular, look out for the
transposition or doubling of letters and for word-clustering.
Punctuation and Full-Text
Searching:
Hyphens: Hyphens act as word separators. Thus,
one should treat hyphenated expressions as separate words, excluding the
hyphen (e.g., if searching for Goethe-Schiller, type in Goethe
Schiller).
Apostrophes: One must include apostrophes when searching words
with apostrophes in them (e.g., only by typing gibt's will one
find "gibt's").
Ampersands: The ampersand (&) is not a searchable
character. Avoid
Phrase Searches where an ampersand may be used as a conjunction.
Full-Text Searching Using PhiloLogic
The term(s) to be searched in selected documents are entered into the
Search for: box on the search-form. Word searches in
PhiloLogic are by default case insensitive, so that a search finds
both lower and upper case representations of words. The user must,
however, take into account diacritics when searching databases that have accented characters. PhiloLogic's wildcard
characters may also be employed to match many forms. The simplest
search in PhiloLogic is a single term search without wildcards. If
searching for a term such as "peccatum" in the database, simply type the word
peccatum into the Search for: box and press the
SEARCH button.
Tip: At this time, only the first 999,000 occurrences of a word are available in the results formats "Occurrences with Context" and "Occurrences Line by Line." Because EEBO-TCP is a very large database, one will encounter this limit with some regularity. One can limit a search by using the bibliographic fields or one can run a Frequency by Title search, from which all occurrences are available.
Boolean Operators
- | (vertical bar):
- serves as the OR operator (e.g., freedom|liberty retrieves
instances of either). Uppercase OR is also accepted and is automatically converted to the vertical bar during searching.
- ! (exclamation point)
- serves as the NOT operator (e.g., !holy ghost retrieves occurrences of ghost, but not holy ghost, whereas Jesus !Christ finds occurrences of Jesus without Christ). Uppercase NOT is also accepted and is automatically converted to the exclamation point during searching.
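The conversion of uppercase OR and NOT described above can be pictured with a short sketch. This is only an illustration of the stated behavior, not PhiloLogic's own code; the function name `normalize_boolean_operators` and the exact whitespace handling are assumptions.

```python
import re

def normalize_boolean_operators(query: str) -> str:
    """Rewrite uppercase OR / NOT as the | and ! operators (assumed behavior)."""
    # "freedom OR liberty" -> "freedom|liberty"
    query = re.sub(r"\s+OR\s+", "|", query)
    # "Jesus NOT Christ" -> "Jesus !Christ"
    query = re.sub(r"\bNOT\s+", "!", query)
    return query

print(normalize_boolean_operators("freedom OR liberty"))  # freedom|liberty
print(normalize_boolean_operators("Jesus NOT Christ"))    # Jesus !Christ
print(normalize_boolean_operators("war or peace"))        # war or peace
```

Lowercase "or" and "not" pass through untouched in this sketch, so they remain ordinary search words.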
Wildcard Characters in Full-Text Searching
Wildcard characters allow the user to enter a single search entry that may
find many forms. This is in contrast to a simple word search which
requires an exact match in order to find a word. Wildcard characters can be useful, for
example, in identifying cognates made obscure by affixes and vowel
weakening, inconsistencies due to irregular orthography, and variations on
account of word inflection as well as for discovering potential emendations
for uncertain readings. The most commonly used wildcards are listed below.
- . (period):
- matches any single character (e.g., gentlem.n will retrieve
gentleman and gentlemen).
- * (asterisk):
- matches any string of characters, anchoring the match at the beginning
of a word (e.g., cigar* will match cigar, cigars, cigarette,
etc.), anchoring the match at the end of a word (e.g., *habit
will retrieve habit, cohabit, and inhabit), or in the middle (e.g.,
c*eers matches compeers, cheers, and careers).
- .? (period question mark):
- matches the characters entered or the characters entered plus one more
character in place of the question mark (e.g., hono.?r matches
both honor and honour and cat.? matches cat and cats, but not
cathedral, Catherine, etc.). Try co.?templa.ion to match contemplation, contemplacion, co˜templation, co˜templacion, comtemplation, and comtemplacion or ..?onderful to match wonderful and vvonderful.
- [a-z] (square brackets):
- matches a single character found in the specified range (e.g.,
[c-f]at will match cat, dat, eat, and fat) or any letters within
the brackets (e.g., d[e|i]spis[i|y]ng will match despising, despisyng, dispising, and dispisyng).
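To get a feel for what the wildcards above will match, one can approximate them with Python regular expressions. The sketch below is an approximation under assumptions, not PhiloLogic's matcher: the helper names `wildcard_to_regex` and `matching_words` and the sample word list are hypothetical.

```python
import re

def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Approximate a wildcard search term as a case-insensitive regex."""
    # Only the asterisk needs translating; ".", ".?" and "[...]" already
    # carry the same meaning in Python's regex syntax.
    return re.compile(pattern.replace("*", ".*"), re.IGNORECASE)

def matching_words(pattern: str, words: list[str]) -> list[str]:
    """Return the words that the pattern matches in full."""
    regex = wildcard_to_regex(pattern)
    return [w for w in words if regex.fullmatch(w)]

# Hypothetical word list standing in for a database index.
words = ["cat", "cats", "cathedral", "honor", "honour",
         "gentleman", "gentlemen", "habit", "cohabit", "inhabit"]

print(matching_words("gentlem.n", words))  # ['gentleman', 'gentlemen']
print(matching_words("cat.?", words))      # ['cat', 'cats']
print(matching_words("hono.?r", words))    # ['honor', 'honour']
print(matching_words("*habit", words))     # ['habit', 'cohabit', 'inhabit']
```

Because the whole word must match, cat.? does not pick up cathedral, which mirrors the behavior described in the list.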
Tip: If you are using wildcard characters and would like to see a
full list of
the words matching your search-term, then run your search as a
"Frequency by Author" search.
The results page of a "Frequency by Author" search lists all the terms
found in a database that match your search-term.
Accents and Special Characters
PhiloLogic requires that one take into account diacritics when
searching documents with accented characters in both bibliographic and
full-text searching. The system provides three ways to search for accented
characters: 1) simply type the required accented character from the
keyboard; 2) use a capital letter to match all accented and non-accented
forms of a letter; or 3) enter the two character representations listed
below.
Tip: If you do not want to have to think about accents,
turn on "Caps Lock" and type in all uppercase. This is recommended
since accentuation varies: one finds, for example, naivete, naivetè, and naïveté in databases. Be sure to enter and, or, and not in lowercase in phrase searches.
- capital letter = any form of the letter
- (e.g., E matches é ê è ë and e (no accent)
and É Ê È Ë and E (no accent)).
- grave = (\) back slash
- (e.g., a\ matches à).
- acute = (/) forward slash
- (e.g., e/ matches é).
- circumflex = (^) caret
- (e.g., e^ matches ê).
- cedilla = (,) comma
- (e.g., c, matches ç).
- umlaut/dieresis = (") double quote
- (e.g., u" matches ü).
- tilde = (~) tilde
- (e.g., n~ matches ñ).
- ae-ligature (æ) = ae
- the ligature is resolved into two letters. (e.g., to search
æther type in aether).
- oe-ligature (œ) = oe
- the ligature is resolved into two letters. (e.g., to search
œconomy type in oeconomy).
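The two-character representations and the capital-letter shortcut listed above can be illustrated with Python's standard `unicodedata` module. This is a sketch of the idea only: the function names `expand_accent_codes` and `fold_accents` are hypothetical, and the code assumes the query contains no ordinary punctuation (a comma after a letter would otherwise be read as a cedilla).

```python
import re
import unicodedata

# Two-character codes from the list above, mapped to Unicode combining marks.
COMBINING = {
    "\\": "\u0300",  # grave
    "/": "\u0301",   # acute
    "^": "\u0302",   # circumflex
    ",": "\u0327",   # cedilla
    '"': "\u0308",   # umlaut/dieresis
    "~": "\u0303",   # tilde
}

def expand_accent_codes(query: str) -> str:
    """Turn codes such as e/ or u" into the accented characters they stand for."""
    def compose(match: re.Match) -> str:
        letter, code = match.group(1), match.group(2)
        return unicodedata.normalize("NFC", letter + COMBINING[code])
    return re.sub(r'([A-Za-z])([\\/^,"~])', compose, query)

def fold_accents(text: str) -> str:
    """Strip diacritics, which is roughly what an uppercase search letter ignores."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(expand_accent_codes('Tagebu"cher'))  # Tagebücher
print(fold_accents("naïveté"))             # naivete
```

Folding shows why typing in uppercase finds naivete, naivetè, and naïveté alike: all three reduce to the same unaccented letters.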
Punctuation and Full-Text Searching
All punctuation should be stripped from word searches except for apostrophes. Apostrophes must be entered as characters.
- apostrophe (') = must be entered with a space following.
- (e.g., to search
d'amour type in d' amour).
- hyphen (-) = a space
- the hyphen is not a searchable character. (e.g., to search
capo-mastro type in capo mastro).
- ampersand (&) = should be stripped
- is not a searchable character. Avoid Phrase Searches where an
ampersand could be used as a conjunction.
- period, question mark, exclamation point, and comma = should be stripped
- are not searchable characters.
- parentheses, various brackets, and double quotes = should be stripped
- are not searchable characters and are word-breaking
(e.g., to search vor[r]ia enter vor r ia).
- common mathematical symbols
- the equal sign (=) and minus sign (-) will produce a "Nothing found"
message. The plus sign (+) is not a searchable character, but, if entered,
will be ignored.
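As a rough illustration of the punctuation rules in this list, the following sketch turns a passage as printed into a string one might type into the search box. It is an assumption-laden example rather than the system's own logic: the function name `text_to_search_string` is hypothetical, "various brackets" is taken to mean round, square, and curly brackets, and the equal sign is left out because the list says only that it produces a "Nothing found" message.

```python
import re

def text_to_search_string(text: str) -> str:
    """Prepare printed text for entry as a search string, per the list above."""
    # Apostrophes are kept and followed by a space: d'amour -> d' amour.
    text = text.replace("'", "' ")
    # Hyphens, brackets, and double quotes break words: replace with a space.
    text = re.sub(r'[-()\[\]{}"]', " ", text)
    # Ampersand, period, question mark, exclamation point, comma, plus: strip.
    text = re.sub(r"[&.?!,+]", "", text)
    # Collapse any runs of whitespace left behind.
    return " ".join(text.split())

print(text_to_search_string("d'amour"))        # d' amour
print(text_to_search_string("capo-mastro"))    # capo mastro
print(text_to_search_string("vor[r]ia"))       # vor r ia
print(text_to_search_string("Sturm & Drang"))  # Sturm Drang
```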
Text Formatting
Formatting (e.g., font shifts, superscript, subscript, italics, bold, underline, etc.) is ignored in a search (e.g., search 1st simply as 1st).
Selecting a Search Option:
One may use upper or lower case letters; searches are case insensitive. Wildcards can be used in all search options. Be sure to review sections on accentuation and punctuation in full-text searching.
- Single Term and Phrase Search: To search a single term in the entire database or a defined corpus make sure that the Single Term and Phrase Search
radio button is highlighted, simply enter the term into the Search Text(s) For: box, and press the SEARCH button. Single Term searching supports wildcard
characters and the Boolean OR-operator, which is the vertical bar (|). Entering, for example, freedom|liberty
retrieves all occurrences of the word "freedom" or "liberty" in the entire database or a specified corpus.
Phrase searching restricts the search to
adjacent words in a particular order (punctuation in the text, except for apostrophes, should not be entered).
- Phrase Separated by a Number of Words:
If you are looking for a phrase that could have intervening words, turn on the Separated by radio button and enter the number of words (e.g., to find "mystery of His body" or "mystery of Christ's body", enter mystery body).
Note: For better performance it is a good idea to exclude very common words such as "of" in separated phrase searches.
- Proximity Searching in the Same Sentence or Paragraph: Searching for more than one term in a single sentence or paragraph without regard to adjacen
Please note that many texts do not mark paragraphs and so the entire text is indexed as one paragraph. Also, some texts have the sign ¶ instead of a paragraph tag <p>. These signs are not indexed as paragraphs.
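The idea behind a phrase separated by a number of words can be sketched for the two-term case. The sketch assumes that the number entered is the maximum count of intervening words and that the terms must appear in the order typed; the source does not spell out either point, and the function name `find_separated_pair` is hypothetical.

```python
def find_separated_pair(tokens: list[str], first: str, second: str,
                        max_gap: int) -> list[tuple[int, int]]:
    """Find `first` followed by `second` with at most `max_gap` words between."""
    words = [t.lower() for t in tokens]  # searches are case insensitive
    first, second = first.lower(), second.lower()
    hits = []
    for i, word in enumerate(words):
        if word != first:
            continue
        # The words that may hold `second`: up to max_gap + 1 positions ahead.
        window = words[i + 1 : i + 2 + max_gap]
        if second in window:
            hits.append((i, i + 1 + window.index(second)))
    return hits

tokens = "the mystery of His body".split()
print(find_separated_pair(tokens, "mystery", "body", max_gap=2))  # [(1, 4)]
print(find_separated_pair(tokens, "mystery", "body", max_gap=1))  # []
```

With a gap of 2 the pair is found because "of His" fits between the terms; with a gap of 1 it is not.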
Selecting a Results Format:
At the head of any results format one finds the bibliographic criteria limiting one's search, the number of texts searched, the search term(s) entered, and the total number of occurrences of the search term(s) in the database. The number
of occurrences displays at the bottom of the report if PhiloLogic has not detected
the number before generating the first 25 occurrences on the screen.
- Occurrences with Context is the default reporting format option. In this format each occurrence is represented by a short citation consisting of the author's name and the title of the work followed by links to the occurrences within several levels of context such as page, paragraph, scene, act, chapter, body, or contents. Below the citation there is a passage of text consisting of some forty words on either side of the key word, which is highlighted. Clicking on the links takes one to that level of context at which point one finds links to the previous and next sections.
- Occurrences Line by Line (KWIC) is a good format for scanning or printing large result sets since it limits the text displayed to a
single line of text. Each occurrence is represented by its Title ID
with a linked reference to where the term(s) in question occur
within the document. At the bottom of the report one finds the Results
Bibliography, which lists the full references for the Title ID.
- Line by Line (KWIC) Sorted: One can also sort Line by Line results. Under Refined Search Results, click on the radio button and indicate whether the results are to be sorted by the word to the left or to the right of the keyword.
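A minimal sketch of how keyword-in-context lines can be built and then sorted by the neighbouring word may make the sorted report easier to picture. It is not the system's implementation; the names `kwic` and `sort_kwic`, the context width, and the sample sentence are all hypothetical.

```python
def kwic(tokens: list[str], keyword: str, width: int = 2):
    """Collect (left context, keyword, right context) for each occurrence."""
    rows = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            rows.append((left, tok, right))
    return rows

def sort_kwic(rows, side: str = "right"):
    """Sort rows by the word immediately to the right or left of the keyword."""
    def neighbour(row):
        left, _, right = row
        if side == "right":
            return right[0].lower() if right else ""
        return left[-1].lower() if left else ""
    return sorted(rows, key=neighbour)

tokens = "the trial began and a trial ended but no trial came".split()
for left, key, right in sort_kwic(kwic(tokens, "trial"), side="right"):
    print(" ".join(left).rjust(10), key.upper(), " ".join(right))
```

Sorting on the right-hand word lists the occurrences in the order began, came, ended; sorting on the left-hand word would order them a, no, the.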
Refined Search Results:
- Frequency by Author, Year, or Title Reports do not display text. They list the number of occurrences in descending order of frequency with the frequency in bold and the rate per 10,000 in square brackets. There is also a link to the digital table of contents for each title and a link to the
occurrences found within that title. At the top of frequency reports one also finds the number of unique forms derived from the search criteria (e.g., clemenc*) within the database and a full list of those unique forms (e.g., clemency | clemencye | clemencie | clemencia).
- Frequency per 10,000 Reports differ in that they list frequency in descending order of rate per 10,000 with [frequency] in brackets (e.g., 4.72 [4]
means 4.72 occurrences in 10,000 words with a total of 4 occurrences in that title, author, or group of years.)
- Collocation Table allows the user to discover lexical
collocations within the database. The user selects one word as the node or
keyword and enters it into the Search for: box.
Wildcards are allowed, but no phrases; single terms
only are permitted. Select the number of words that a given word can be
separated from the keyword (5 words is the default). The program then scans
the concordance entries for the keyword and lists in table format all the
words which occur within the specified distance of the keyword in order of
frequency. The three columns represent words on either side of the keyword,
words to the left of the keyword, and words to the right of the keyword.
- Line by Line (KWIC) Sorted by Keyword allows a user to sort results by the words to the right or left of the keyword. This report does not support phrase searching. It can only be generated for a single word or word pattern (e.g., myster*). Results of over 20,000 cannot be sorted.
- Word in Clause Position (Theme/Rheme): A Word in Clause Position Report can only be generated for a single word or word pattern (e.g., concord*). A word's position is determined by the percentage of the clause length within which it falls: Front of Clause (first 35%), Last (last 10%), Remainder (middle 55%), or Too Short (clause length of 3 words or fewer). Words of 2 letters or fewer and numbers are excluded in calculating clause length. Please note that clauses are identified with punctuation as the primary determining factor, so many unpunctuated clauses will go undetected. This feature is experimental and should be used only as a rough indicator.
The search can take some time for complete results, but one does receive results as they are ready. When the report is finished, click on the link to "Statistical Summary" to see a rough indication of word position.
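The arithmetic behind three of the reports above (rate per 10,000, the collocation table, and clause position) can be sketched as follows. These are illustrations under assumptions rather than the system's actual calculations: all function names are hypothetical, the word count of 8,475 is invented to reproduce the 4.72 [4] example, and the exact rule for placing a word at a percentage of the clause is a guess based on the thresholds given.

```python
from collections import Counter

def rate_per_10000(occurrences: int, total_words: int) -> float:
    """Occurrences per 10,000 words, rounded to two decimals."""
    return round(occurrences / total_words * 10_000, 2)

def collocation_table(tokens: list[str], keyword: str, span: int = 5):
    """Count words within `span` words of the keyword: (either side, left, right)."""
    words = [t.lower() for t in tokens]
    left, right = Counter(), Counter()
    for i, word in enumerate(words):
        if word != keyword.lower():
            continue
        left.update(words[max(0, i - span):i])
        right.update(words[i + 1:i + 1 + span])
    return left + right, left, right

def counted_length(clause_words: list[str]) -> int:
    """Clause length, excluding words of 2 letters or fewer and numbers."""
    return sum(1 for w in clause_words if len(w) > 2 and not w.isdigit())

def clause_position(position: int, clause_length: int) -> str:
    """Classify a 1-based word position within a clause of counted length."""
    if clause_length <= 3:
        return "Too Short"
    fraction = position / clause_length  # assumed way of locating the word
    if fraction <= 0.35:
        return "Front of Clause"
    if fraction > 0.90:
        return "Last"
    return "Remainder"

print(rate_per_10000(4, 8475))  # 4.72
both, left, right = collocation_table(
    "the castle on the hill and the castle gate".split(), "castle", span=2)
print(both.most_common(1))      # [('the', 3)]
print(clause_position(1, 10), "/", clause_position(5, 10), "/", clause_position(10, 10))
```

The three counters returned by `collocation_table` correspond to the three columns of the report: words on either side, words to the left, and words to the right of the keyword.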