Data Wrangling with Python (Python数据处理)

Subtitle: None

Author: Jacqueline Kazil

Classification number:

ISBN: 9787564170035

Introduction

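How do you take your data analysis skills beyond Excel to the next level? By digging into Python to get the job done. Data Wrangling with Python (Python数据处理, reprint edition, in English), by Jacqueline Kazil and Katharine Jarmul, shows non-programmers how to process information that is inherently too messy or too hard to get a handle on. You do not need to know the basics of the Python programming language to get started.
Through step-by-step exercises, you will learn how to acquire, clean, analyze, and present data effectively. You will also learn how to automate data processing, schedule file-editing and cleanup tasks, process larger datasets, and tell compelling stories with the data you obtain.
Quickly learn basic Python syntax, data types, and language concepts
Work with both machine-readable and human-consumable data
Scrape websites and APIs to find a wealth of useful information
Clean and format data to eliminate duplicates and errors in your datasets
Learn when to standardize data and when to test and script data cleanup
Explore and analyze datasets with new Python libraries and techniques
Use Python solutions to automate your entire data-processing workflow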

Preface
1. Introduction to Python
  Why Python
  Getting Started with Python
  Which Python Version
  Setting Up Python on Your Machine
  Test Driving Python
  Install pip
  Install a Code Editor
  Optional: Install IPython
  Summary
2. Python Basics
  Basic Data Types
  Strings
  Integers and Floats
  Data Containers
  Variables
  Lists
  Dictionaries
  What Can the Various Data Types Do?
  String Methods: Things Strings Can Do
  Numerical Methods: Things Numbers Can Do
  List Methods: Things Lists Can Do
  Dictionary Methods: Things Dictionaries Can Do
  Helpful Tools: type, dir, and help
  type
  dir
  help
  Putting It All Together
  What Does It All Mean?
  Summary
3. Data Meant to Be Read by Machines
  CSV Data
  How to Import CSV Data
  Saving the Code to a File; Running from Command Line
  JSON Data
  How to Import JSON Data
  XML Data
  How to Import XML Data
  Summary
4. Working with Excel Files
  Installing Python Packages
  Parsing Excel Files
  Getting Started with Parsing
  Summary
5. PDFs and Problem Solving in Python
  Avoid Using PDFs!
  Programmatic Approaches to PDF Parsing
  Opening and Reading Using slate
  Converting PDF to Text
  Parsing PDFs Using pdfminer
  Learning How to Solve Problems
  Exercise: Use Table Extraction, Try a Different Library
  Exercise: Clean the Data Manually
  Exercise: Try Another Tool
  Uncommon File Types
  Summary
6. Acquiring and Storing Data
  Not All Data Is Created Equal
  Fact Checking
  Readability, Cleanliness, and Longevity
  Where to Find Data
  Using a Telephone
  US Government Data
  Government and Civic Open Data Worldwide
  Organization and Non-Government Organization (NGO) Data
  Education and University Data
  Medical and Scientific Data
  Crowdsourced Data and APIs
  Case Studies: Example Data Investigation
  Ebola Crisis
  Train Safety
  Football Salaries
  Child Labor
  Storing Your Data: When, Why, and How?
  Databases: A Brief Introduction
  Relational Databases: MySQL and PostgreSQL
  Non-Relational Databases: NoSQL
  Setting Up Your Local Database with Python
  When to Use a Simple File
  Cloud-Storage and Python
  Local Storage and Python
  Alternative Data Storage
  Summary
7. Data Cleanup: Investigation, Matching, and Formatting
  Why Clean Data?
  Data Cleanup Basics
  Identifying Values for Data Cleanup
  Formatting Data
  Finding Outliers and Bad Data
  Finding Duplicates
  Fuzzy Matching
  RegEx Matching
  What to Do with Duplicate Records
  Summary
8. Data Cleanup: Standardizing and Scripting
  Normalizing and Standardizing Your Data
  Saving Your Data
  Determining What Data Cleanup Is Right for Your Project
  Scripting Your Cleanup
  Testing with New Data
  Summary
9. Data Exploration and Analysis
  Exploring Your Data
  Importing Data
  Exploring Table Functions
  Joining Numerous Datasets
  Identifying Correlations
  Identifying Outliers
  Creating Groupings
  Further Exploration
  Analyzing Your Data
  Separating and Focusing Your Data
  What Is Your Data Saying?
  Drawing Conclusions
  Documenting Your Conclusions
  Summary
10. Presenting Your Data
  Avoiding Storytelling Pitfalls
  How Will You Tell the Story?
  Know Your Audience
  Visualizing Your Data
  Charts
  Time-Related Data
  Maps
  Interactives
  Words
  Images, Video, and Illustrations
  Presentation Tools
  Publishing Your Data
  Using Available Sites
  Open Source Platforms: Starting a New Site
  Jupyter (Formerly Known as IPython Notebooks)
  Summary
11. Web Scraping: Acquiring and Storing Data from the Web
  What to Scrape and How
  Analyzing a Web Page
  Inspection: Markup Structure
  Network/Timeline: How the Page Loads
  Console: Interacting with JavaScript
  In-Depth Analysis of a Page
  Getting Pages: How to Request on the Internet
  Reading a Web Page with Beautiful Soup
  Reading a Web Page with LXML
  A Case for XPath
  Summary
12. Advanced Web Scraping: Screen Scrapers and Spiders
  Browser-Based Parsing
  Screen Reading with Selenium
  Screen Reading with Ghost.Py
  Spidering the Web
  Building a Spider with Scrapy
  Crawling Whole Websites with Scrapy
  Networks: How the Internet Works and Why It's Breaking Your Script
  The Changing Web (or Why Your Script Broke)
  A (Few) Word(s) of Caution
  Summary
13. APIs
  API Features
  REST Versus Streaming APIs
  Rate Limits
  Tiered Data Volumes
  API Keys and Tokens
  A Simple Data Pull from Twitter's REST API
  Advanced Data Collection from Twitter's REST API
  Advanced Data Collection from Twitter's Streaming API
  Summary
14. Automation and Scaling
  Why Automate?
  Steps to Automate
  What Could Go Wrong?
  Where to Automate
  Special Tools for Automation
  Using Local Files, argv, and Config Files
  Using the Cloud for Data Processing
  Using Parallel Processing
  Using Distributed Processing
  Simple Automation
  CronJobs
  Web Interfaces
  Jupyter Notebooks
  Large-Scale Automation
  Celery: Queue-Based Automation
  Ansible: Operations Automation
  Monitoring Your Automation
  Python Logging
  Adding Automated Messaging
  Uploading and Other Reporting
  Logging and Monitoring as a Service
  No System Is Foolproof
  Summary
15. Conclusion
  Duties of a Data Wrangler
  Beyond Data Wrangling
  Become a Better Data Analyst
  Become a Better Developer
  Become a Better Visual Storyteller
  Become a Better Systems Architect
  Where Do You Go from Here?
A. Comparison of Languages Mentioned
B. Python Resources for Beginners
C. Learning the Command Line
D. Advanced Python Setup
E. Python Gotchas
F. IPython Hints
G. Using Amazon Web Services
Index
